LSHTM_analysis/scripts/ml/combined_model/log_cm_skf.txt

/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
1.22.4
1.4.1
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 424
PASS: my_features_df and aa_df successfully combined
nrows: 424
ncols: 267
count of NULL values before imputation
or_mychisq 102
log10_or_mychisq 102
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: pyrazinamide
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: pnca
Dim of training_df: (424, 174)
This EXCLUDES Odds Ratio
############################################################
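
The per-gene preprocessing logged above (keeping numerical aaindex columns, dropping columns with NA, merging with the other feature frame, imputing the odds-ratio columns) can be sketched roughly as below. This is a minimal illustration with hypothetical frame names (aaindex_df, my_features_df) and a simple median fill for the OR columns; the script's actual imputation may differ.

# Minimal sketch of the preprocessing steps reported in the log above.
# aaindex_df / my_features_df are hypothetical inputs; the median fill
# for the odds-ratio columns is an assumption, not the script's method.
import pandas as pd

def preprocess_gene(aaindex_df: pd.DataFrame, my_features_df: pd.DataFrame) -> pd.DataFrame:
    aa_num = aaindex_df.select_dtypes(include='number')       # drop non-numerical cols
    aa_clean = aa_num.dropna(axis=1, how='any')                # drop cols containing NA
    print(f"Revised df ncols: {aa_clean.shape[1]}")

    merged = my_features_df.join(aa_clean, how='inner')        # combine on the shared index
    print(f"nrows: {merged.shape[0]}, ncols: {merged.shape[1]}")

    for col in ('or_mychisq', 'log10_or_mychisq'):             # impute OR columns
        if col in merged.columns:
            merged[col] = merged[col].fillna(merged[col].median())
    return merged
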
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 858
PASS: my_features_df and aa_df successfully combined
nrows: 858
ncols: 271
count of NULL values before imputation
or_mychisq 244
log10_or_mychisq 244
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: ethambutol
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'mcsm_ppi2_affinity', 'interface_dist', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: embb
Dim of training_df: (858, 176)
This EXCLUDES Odds Ratio
############################################################
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 817
PASS: my_features_df and aa_df successfully combined
nrows: 817
ncols: 271
count of NULL values before imputation
or_mychisq 244
log10_or_mychisq 244
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: isoniazid
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'mcsm_ppi2_affinity', 'interface_dist', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: katg
Dim of training_df: (817, 176)
This EXCLUDES Odds Ratio
############################################################
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 1133
PASS: my_features_df and aa_df successfully combined
nrows: 1133
ncols: 276
count of NULL values before imputation
or_mychisq 339
log10_or_mychisq 339
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: rifampicin
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: rpob
Dim of training_df: (1132, 177)
This EXCLUDES Odds Ratio
############################################################
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 531
PASS: my_features_df and aa_df successfully combined
nrows: 531
ncols: 288
count of NULL values before imputation
or_mychisq 263
log10_or_mychisq 263
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: streptomycin
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'mcsm_na_affinity', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: gid
Dim of training_df: (531, 175)
This EXCLUDES Odds Ratio
############################################################
aaindex_df contains non-numerical data
Total no. of non-numerical columns: 2
Selecting numerical data only
PASS: successfully selected numerical columns only for aaindex_df
Now checking for NA in the remaining aaindex_cols
Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127
Revised df ncols: 123
Checking NA in revised df...
PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df
PASS: ncols match
Expected ncols: 123
Got: 123
Total no. of columns in clean aa_df: 123
Proceeding to merge, expected nrows in merged_df: 271
PASS: my_features_df and aa_df successfully combined
nrows: 271
ncols: 271
count of NULL values before imputation
or_mychisq 256
log10_or_mychisq 256
dtype: int64
count of NULL values AFTER imputation
mutationinformation 0
or_rawI 0
logorI 0
dtype: int64
PASS: OR values imputed, data ready for ML
Total no. of features for aaindex: 123
Genomic features being used EXCLUDING odds ratio (n): 6
These are: ['maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name']
dst column exists
and is identical to the drug column: cycloserine
All feature names: ['consurf_score', 'snap2_score', 'provean_score', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'contacts', 'electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss', 'ligand_distance', 'ligand_affinity_change', 'mmcsm_lig', 'mcsm_ppi2_affinity', 'interface_dist', 'ALTS910101', 'AZAE970101', 'AZAE970102', 'BASU010101', 'BENS940101', 'BENS940102', 'BENS940103', 'BENS940104', 'BETM990101', 'BLAJ010101', 'BONM030101', 'BONM030102', 'BONM030103', 'BONM030104', 'BONM030105', 'BONM030106', 'BRYS930101', 'CROG050101', 'CSEM940101', 'DAYM780301', 'DAYM780302', 'DOSZ010101', 'DOSZ010102', 'DOSZ010103', 'DOSZ010104', 'FEND850101', 'FITW660101', 'GEOD900101', 'GIAG010101', 'GONG920101', 'GRAR740104', 'HENS920101', 'HENS920102', 'HENS920103', 'HENS920104', 'JOHM930101', 'JOND920103', 'JOND940101', 'KANM000101', 'KAPO950101', 'KESO980101', 'KESO980102', 'KOLA920101', 'KOLA930101', 'KOSJ950100_RSA_SST', 'KOSJ950100_SST', 'KOSJ950110_RSA', 'KOSJ950115', 'LEVJ860101', 'LINK010101', 'LIWA970101', 'LUTR910101', 'LUTR910102', 'LUTR910103', 'LUTR910104', 'LUTR910105', 'LUTR910106', 'LUTR910107', 'LUTR910108', 'LUTR910109', 'MCLA710101', 'MCLA720101', 'MEHP950102', 'MICC010101', 'MIRL960101', 'MIYS850102', 'MIYS850103', 'MIYS930101', 'MIYS960101', 'MIYS960102', 'MIYS960103', 'MIYS990106', 'MIYS990107', 'MIYT790101', 'MOHR870101', 'MOOG990101', 'MUET010101', 'MUET020101', 'MUET020102', 'NAOD960101', 'NGPC000101', 'NIEK910101', 'NIEK910102', 'OGAK980101', 'OVEJ920100_RSA', 'OVEJ920101', 'OVEJ920102', 'OVEJ920103', 'PRLA000101', 'PRLA000102', 'QUIB020101', 'QU_C930101', 'QU_C930102', 'QU_C930103', 'RIER950101', 'RISJ880101', 'RUSR970101', 'RUSR970102', 'RUSR970103', 'SIMK990101', 'SIMK990102', 'SIMK990103', 'SIMK990104', 'SIMK990105', 'SKOJ000101', 'SKOJ000102', 'SKOJ970101', 'TANS760101', 'TANS760102', 'THOP960101', 'TOBD000101', 'TOBD000102', 'TUDE900101', 'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101', 'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106', 'rsa', 'kd_values', 'rd_values', 'ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'active_site', 'maf', 'lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique', 'gene_name', 'dst', 'dst_mode']
PASS: but NOT writing mask file
PASS: but NOT writing processed file
#################################################################
SUCCESS: Extracted training data for gene: alr
Dim of training_df: (271, 176)
This EXCLUDES Odds Ratio
############################################################
Proceeding to combine based on common cols (n): 174
Successfully combined dfs:
No. of dfs combined: 6
Dim of combined df: (4033, 174)
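
A rough sketch of the "combine based on common cols" step above, under the assumption that the six per-gene frames are intersected on their shared columns and stacked row-wise; gene_dfs is a hypothetical dict of the preprocessed frames, not a name from the script.

# Hypothetical sketch: stack the per-gene frames on their common columns.
import pandas as pd
from functools import reduce

def combine_on_common_cols(gene_dfs: dict) -> pd.DataFrame:
    common = reduce(set.intersection, (set(df.columns) for df in gene_dfs.values()))
    print(f"Proceeding to combine based on common cols (n): {len(common)}")
    combined = pd.concat([df[sorted(common)] for df in gene_dfs.values()],
                         axis=0, ignore_index=True)
    print(f"No. of dfs combined: {len(gene_dfs)}")
    print(f"Dim of combined df: {combined.shape}")
    return combined
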
Gene name included
BTS gene: embb
Total genes: 6
Training on: 5
Training on genes: ['alr', 'katg', 'pnca', 'gid', 'rpob']
Omitted genes: ['embb']
Blind test gene: embb
/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_embb.csv
Training data dim: (3175, 171)
Training Target dim: (3175,)
Checked training df does NOT have Target var
TEST data dim: (858, 171)
TEST Target dim: (858,)
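
The split above holds out the blind-test (BTS) gene and trains on the remaining five. A sketch of that leave-one-gene-out split follows; treating 'dst_mode' as the target and dropping 'dst', 'dst_mode' and 'gene_name' from the feature matrix is inferred from the 174 vs 171 column counts, not stated in the log.

# Hypothetical leave-one-gene-out split: hold out one gene as the blind test set.
def logo_split(combined_df, bts_gene='embb', target='dst_mode',
               drop_cols=('dst', 'dst_mode', 'gene_name')):
    train_df = combined_df[combined_df['gene_name'] != bts_gene]
    bts_df   = combined_df[combined_df['gene_name'] == bts_gene]

    X_train, y_train = train_df.drop(columns=list(drop_cols)), train_df[target]
    X_bts,   y_bts   = bts_df.drop(columns=list(drop_cols)),   bts_df[target]
    print(f"Training data dim: {X_train.shape}")
    print(f"TEST data dim: {X_bts.shape}")
    return X_train, y_train, X_bts, y_bts
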
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
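
Each (name, estimator) pair in the list above is wrapped in the preprocessing pipeline printed further down (MinMaxScaler on the numerical columns, OneHotEncoder on the categorical ones) and scored with stratified k-fold cross-validation. The loop below is an illustrative reconstruction, assuming 10 folds (the per-fold arrays in the log have 10 entries) and standard scikit-learn scorers for the logged metric names; it is not the project's exact code.

# Illustrative sketch, not the script itself: wrap each estimator in the
# MinMaxScaler/OneHotEncoder pipeline and score it with 10-fold stratified CV.
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import matthews_corrcoef, make_scorer

def run_models(models, X, y, num_cols, cat_cols):
    prep = ColumnTransformer(
        transformers=[('num', MinMaxScaler(), num_cols),
                      ('cat', OneHotEncoder(), cat_cols)],
        remainder='passthrough')
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    scoring = {'mcc': make_scorer(matthews_corrcoef), 'fscore': 'f1',
               'precision': 'precision', 'recall': 'recall',
               'accuracy': 'accuracy', 'roc_auc': 'roc_auc', 'jcc': 'jaccard'}
    results = {}
    for name, clf in models:
        pipe = Pipeline([('prep', prep), ('model', clf)])
        results[name] = cross_validate(pipe, X, y, cv=skf, scoring=scoring,
                                       return_train_score=True)
    return results  # per model: fit_time, score_time, test_mcc, train_mcc, ...
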
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.82235312 0.82206702 0.73253083 0.81550765 0.82528734 0.77237988
0.78904128 0.71050978 0.79281211 0.73421216]
mean value: 0.7816701173782349
key: score_time
value: [0.01858354 0.01903391 0.01995635 0.01969576 0.02019024 0.01824546
0.01880765 0.01854897 0.01869416 0.01841807]
mean value: 0.019017410278320313
key: test_mcc
value: [0.34923246 0.3772473 0.49360881 0.47101942 0.53598749 0.46690849
0.2359221 0.45754152 0.38489879 0.41644626]
mean value: 0.4188812630162668
key: train_mcc
value: [0.51401256 0.51922653 0.5144182 0.48292483 0.47709175 0.50863689
0.51692073 0.524495 0.51180099 0.50576309]
mean value: 0.5075290568809663
key: test_fscore
value: [0.52272727 0.56701031 0.62921348 0.61452514 0.67368421 0.61956522
0.44318182 0.61702128 0.56842105 0.54878049]
mean value: 0.5804130267948135
key: train_fscore
value: [0.64868179 0.65365854 0.64954128 0.62759463 0.62747451 0.64858348
0.65073529 0.66066066 0.64749082 0.64069264]
mean value: 0.6455113643566246
key: test_precision
value: [0.5974026 0.57894737 0.70886076 0.6875 0.7032967 0.6627907
0.5 0.65168539 0.59340659 0.69230769]
mean value: 0.6376197805261155
key: train_precision
value: [0.71293801 0.71371505 0.71179625 0.68624833 0.6722365 0.69960988
0.71563342 0.70876289 0.71006711 0.71153846]
mean value: 0.7042545901984184
key: test_recall
value: [0.46464646 0.55555556 0.56565657 0.55555556 0.64646465 0.58163265
0.39795918 0.58585859 0.54545455 0.45454545]
mean value: 0.5353329210472068
key: train_recall
value: [0.59505062 0.60292463 0.59730034 0.57817773 0.58830146 0.60449438
0.59662921 0.61867267 0.59505062 0.58267717]
mean value: 0.5959278826101794
key: test_accuracy
value: [0.73584906 0.73584906 0.79245283 0.78301887 0.80503145 0.77917981
0.69085174 0.77287066 0.74132492 0.76656151]
mean value: 0.7602989901394758
key: train_accuracy
value: [0.79943997 0.80119006 0.79943997 0.78648932 0.78263913 0.7960112
0.80055983 0.80230931 0.79846046 0.79671099]
mean value: 0.7963250244387657
key: test_roc_auc
value: [0.66154698 0.68645358 0.73031687 0.72070015 0.76158849 0.72460628
0.6099385 0.72182838 0.68786489 0.68140117]
mean value: 0.6986245267964868
key: train_roc_auc
value: [0.74340946 0.74683833 0.74402618 0.72938358 0.72936415 0.74355817
0.74470688 0.7519468 0.74267513 0.73801202]
mean value: 0.7413920696728387
key: test_jcc
value: [0.35384615 0.39568345 0.45901639 0.44354839 0.50793651 0.4488189
0.28467153 0.44615385 0.39705882 0.37815126]
mean value: 0.41148852562314386
key: train_jcc
value: [0.4800363 0.48550725 0.48097826 0.45729537 0.45716783 0.47992864
0.48228883 0.49327354 0.47873303 0.47133758]
mean value: 0.47665466280983476
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
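
The warning pair above is emitted when BaggingClassifier is fit with oob_score=True and its default of only 10 estimators: some training samples are never left out-of-bag, so the OOB decision function divides by zero and produces NaNs. One possible mitigation (a suggestion, not something the script does) is to raise n_estimators:

# Suggested tweak only: a larger ensemble makes every sample likely to be
# out-of-bag for at least one estimator, giving a usable oob_score_.
from sklearn.ensemble import BaggingClassifier
bag = BaggingClassifier(n_estimators=100, oob_score=True, n_jobs=10, random_state=42)
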
MCC on Blind test: 0.16
Accuracy on Blind test: 0.81
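
The blind-test numbers printed after each classifier presumably come from refitting the pipeline on the full training set and scoring the held-out gene; a minimal sketch under that assumption, with hypothetical variable names:

# Minimal sketch of the blind-test evaluation (hypothetical variable names).
from sklearn.metrics import matthews_corrcoef, accuracy_score

def blind_test(pipe, X_train, y_train, X_bts, y_bts):
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_bts)
    print(f"MCC on Blind test: {round(matthews_corrcoef(y_bts, y_pred), 2)}")
    print(f"Accuracy on Blind test: {round(accuracy_score(y_bts, y_pred), 2)}")
    return y_pred
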
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.38036156 0.39023089 0.38575459 0.37192464 0.38617873 0.36494088
0.3846159 0.28515506 0.37886834 0.38482165]
mean value: 0.3712852239608765
key: score_time
value: [0.04513955 0.03753614 0.04251885 0.03930163 0.04536176 0.0319047
0.03785133 0.04319596 0.03059578 0.04125047]
mean value: 0.03946561813354492
key: test_mcc
value: [0.37393026 0.43156873 0.4084137 0.41870153 0.44596625 0.42377201
0.25953351 0.33251179 0.47517841 0.29428873]
mean value: 0.38638649154026067
key: train_mcc
value: [0.95010231 0.96003142 0.95431125 0.9544201 0.95420452 0.95597188
0.94781699 0.95667972 0.95523716 0.95833207]
mean value: 0.9547107409048972
key: test_fscore
value: [0.51533742 0.59459459 0.5698324 0.55421687 0.57988166 0.57471264
0.46666667 0.51933702 0.63157895 0.47674419]
mean value: 0.5482902404751074
key: train_fscore
value: [0.96488198 0.97179044 0.96766744 0.96755504 0.96781609 0.96889401
0.96296296 0.96955773 0.96815287 0.97070649]
mean value: 0.967998504710831
key: test_precision
value: [0.65625 0.63953488 0.6375 0.68656716 0.7 0.65789474
0.51219512 0.57317073 0.65934066 0.56164384]
mean value: 0.6284097133357774
key: train_precision
value: [0.98820755 0.99528302 0.9940688 0.99761051 0.98942421 0.99408983
0.9928401 0.99061033 0.99761337 0.99178404]
mean value: 0.9931531749823851
key: test_recall
value: [0.42424242 0.55555556 0.51515152 0.46464646 0.49494949 0.51020408
0.42857143 0.47474747 0.60606061 0.41414141]
mean value: 0.48882704596990323
key: train_recall
value: [0.94263217 0.94938133 0.94263217 0.93925759 0.94713161 0.94494382
0.93483146 0.94938133 0.94038245 0.95050619]
mean value: 0.9441080117794265
key: test_accuracy
value: [0.75157233 0.76415094 0.75786164 0.7672956 0.77672956 0.76656151
0.69716088 0.72555205 0.77917981 0.71608833]
mean value: 0.750215264964387
key: train_accuracy
value: [0.97864893 0.98284914 0.98039902 0.98039902 0.98039902 0.98110567
0.97760672 0.98145556 0.98075577 0.98215535]
mean value: 0.9805774211033853
key: test_roc_auc
value: [0.6618929 0.70700152 0.69136571 0.68437803 0.69952954 0.69574131
0.62296151 0.65709851 0.73192939 0.63367621]
mean value: 0.6785574633309781
key: train_roc_auc
value: [0.96877544 0.9736744 0.97004576 0.96912067 0.97127922 0.97120158
0.96589134 0.97265918 0.96968335 0.97347554]
mean value: 0.9705806480092324
key: test_jcc
value: [0.34710744 0.42307692 0.3984375 0.38333333 0.40833333 0.40322581
0.30434783 0.35074627 0.46153846 0.3129771 ]
mean value: 0.37931239897305075
key: train_jcc
value: [0.93214683 0.94512878 0.93736018 0.93714927 0.9376392 0.9396648
0.92857143 0.94091416 0.9382716 0.94308036]
mean value: 0.9379926610305276
MCC on Blind test: 0.21
Accuracy on Blind test: 0.83
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.22233462 0.19524002 0.1886189 0.18508697 0.18715334 0.18904018
0.1891408 0.19395995 0.2046504 0.18566108]
mean value: 0.19408862590789794
key: score_time
value: [0.0098896 0.00984883 0.00994802 0.00979733 0.00979114 0.00986004
0.00983858 0.00976801 0.00986052 0.00987196]
mean value: 0.009847402572631836
key: test_mcc
value: [0.38630967 0.25932185 0.25195348 0.26027106 0.36016923 0.29977309
0.23835404 0.23099758 0.29750549 0.17624578]
mean value: 0.27609012814561457
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57435897 0.49756098 0.50458716 0.4950495 0.56585366 0.50793651
0.49056604 0.47236181 0.52216749 0.44444444]
mean value: 0.507488655626587
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.58333333 0.48113208 0.46218487 0.48543689 0.54716981 0.52747253
0.45614035 0.47 0.50961538 0.42592593]
mean value: 0.49484111761702804
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.56565657 0.51515152 0.55555556 0.50505051 0.58585859 0.48979592
0.53061224 0.47474747 0.53535354 0.46464646]
mean value: 0.5222428365285507
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.73899371 0.67610063 0.66037736 0.67924528 0.72012579 0.70662461
0.65930599 0.66876972 0.69400631 0.63722397]
mean value: 0.6840773366664683
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.69150408 0.63200498 0.63165906 0.63152069 0.68334025 0.64672444
0.62375361 0.6158141 0.65070429 0.5901214 ]
mean value: 0.6397146904355292
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.4028777 0.33116883 0.33742331 0.32894737 0.39455782 0.34042553
0.325 0.30921053 0.35333333 0.28571429]
mean value: 0.34086587107225996
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.03
Accuracy on Blind test: 0.68
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02443075 0.02303457 0.02403355 0.02386045 0.02345037 0.02333164
0.02373028 0.02308965 0.0232594 0.02335906]
mean value: 0.02355797290802002
key: score_time
value: [0.00973511 0.00976896 0.00972152 0.00969577 0.00970984 0.0097177
0.00977254 0.00974107 0.0097661 0.00977802]
mean value: 0.009740662574768067
key: test_mcc
value: [0.14416908 0.27568221 0.3470989 0.22784021 0.22616134 0.18216399
0.17225983 0.22261365 0.32650141 0.11879494]
mean value: 0.22432855735963822
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.41584158 0.51851852 0.56338028 0.46700508 0.4729064 0.44
0.43564356 0.48623853 0.54187192 0.38974359]
mean value: 0.47311494718424774
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.40776699 0.47863248 0.52631579 0.46938776 0.46153846 0.43137255
0.42307692 0.44537815 0.52884615 0.39583333]
mean value: 0.4568148585574449
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.42424242 0.56565657 0.60606061 0.46464646 0.48484848 0.44897959
0.44897959 0.53535354 0.55555556 0.38383838]
mean value: 0.49181612038754896
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.62893082 0.67295597 0.70754717 0.66981132 0.66352201 0.6466877
0.64037855 0.6466877 0.70662461 0.62460568]
mean value: 0.6607751522726821
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.57285181 0.64355888 0.67974263 0.61360177 0.61457036 0.5920697
0.58750349 0.61630062 0.66539246 0.55889167]
mean value: 0.6144483391933975
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.2625 0.35 0.39215686 0.30463576 0.30967742 0.28205128
0.27848101 0.32121212 0.37162162 0.24203822]
mean value: 0.31143742977931027
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.02
Accuracy on Blind test: 0.64
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.48304725 0.48832083 0.48133349 0.48609257 0.49304461 0.4891398
0.48729038 0.4915247 0.48695421 0.48075628]
mean value: 0.4867504119873047
key: score_time
value: [0.02421784 0.02446222 0.02468348 0.02463937 0.02459335 0.02449298
0.02460098 0.02468538 0.02453279 0.02452064]
mean value: 0.024542903900146483
key: test_mcc
value: [0.30306286 0.410032 0.33807779 0.43867449 0.34005282 0.30394033
0.27901296 0.35223463 0.49299795 0.33471635]
mean value: 0.359280217503482
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.44155844 0.58064516 0.5 0.5477707 0.49079755 0.43708609
0.45121951 0.52023121 0.62921348 0.4939759 ]
mean value: 0.5092498055041689
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.61818182 0.62068966 0.60869565 0.74137931 0.625 0.62264151
0.56060606 0.60810811 0.70886076 0.6119403 ]
mean value: 0.6326103172022237
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.34343434 0.54545455 0.42424242 0.43434343 0.4040404 0.33673469
0.37755102 0.45454545 0.56565657 0.41414141]
mean value: 0.43001443001443
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.72955975 0.75471698 0.73584906 0.77672956 0.73899371 0.7318612
0.71608833 0.73817035 0.79179811 0.73501577]
mean value: 0.7448782810546991
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.62377197 0.69738481 0.65047738 0.68292514 0.64722568 0.62270525
0.62256546 0.66075897 0.73007599 0.64743768]
mean value: 0.6585328318644895
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.28333333 0.40909091 0.33333333 0.37719298 0.32520325 0.27966102
0.29133858 0.3515625 0.45901639 0.328 ]
mean value: 0.3437732303315177
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.19
Accuracy on Blind test: 0.82
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.47058582 3.50572276 3.3582952 3.35970736 3.36448741 3.53767443
3.39990234 3.38683867 3.37601209 3.38574433]
mean value: 3.4144970417022704
key: score_time
value: [0.01134539 0.01128292 0.01058722 0.01064944 0.01040602 0.01100373
0.01090407 0.01035905 0.01087427 0.01092434]
mean value: 0.010833644866943359
key: test_mcc
value: [0.40814064 0.47845865 0.44234757 0.54799487 0.57869989 0.47565352
0.33430194 0.47517841 0.45395621 0.39023768]
mean value: 0.4584969402973254
key: train_mcc
value: [0.6589473 0.64854311 0.66215305 0.62053421 0.62792201 0.64579843
0.64751168 0.64402613 0.64893153 0.63172709]
mean value: 0.6436094526722355
key: test_fscore
value: [0.55813953 0.63541667 0.5862069 0.66666667 0.68571429 0.61797753
0.50867052 0.63157895 0.61780105 0.52760736]
mean value: 0.6035779455256195
key: train_fscore
value: [0.74844334 0.74129353 0.74842767 0.71690944 0.72407291 0.73697426
0.74229346 0.7404674 0.73817035 0.72795497]
mean value: 0.7365007340041063
key: test_precision
value: [0.65753425 0.65591398 0.68 0.75641026 0.78947368 0.6875
0.58666667 0.65934066 0.64130435 0.671875 ]
mean value: 0.6786018839524162
key: train_precision
value: [0.83821478 0.82892907 0.84878745 0.82028986 0.82051282 0.83499289
0.82240437 0.81682497 0.84051724 0.81971831]
mean value: 0.8291191750588291
key: test_recall
value: [0.48484848 0.61616162 0.51515152 0.5959596 0.60606061 0.56122449
0.44897959 0.60606061 0.5959596 0.43434343]
mean value: 0.5464749536178107
key: train_recall
value: [0.67604049 0.6704162 0.66929134 0.63667042 0.64791901 0.65955056
0.67640449 0.67716535 0.65804274 0.65466817]
mean value: 0.6626168779464365
key: test_accuracy
value: [0.76100629 0.77987421 0.77358491 0.81446541 0.82704403 0.78548896
0.7318612 0.77917981 0.76971609 0.75709779]
mean value: 0.7779318691347736
key: train_accuracy
value: [0.85859293 0.85439272 0.859993 0.84354218 0.84634232 0.85339398
0.85374388 0.8523443 0.85479356 0.84779566]
mean value: 0.8524934521743581
key: test_roc_auc
value: [0.68534662 0.73502145 0.70278124 0.7546008 0.76650062 0.72353462
0.65371354 0.73192939 0.72229172 0.66900658]
mean value: 0.7144726575721746
key: train_roc_auc
value: [0.8085487 0.8039581 0.80771477 0.78683114 0.79194731 0.80030374
0.80517379 0.80430132 0.80083448 0.79483027]
mean value: 0.8004443623506085
key: test_jcc
value: [0.38709677 0.46564885 0.41463415 0.5 0.52173913 0.44715447
0.34108527 0.46153846 0.4469697 0.35833333]
mean value: 0.4344200140635663
key: train_jcc
value: [0.59800995 0.58893281 0.59798995 0.55873643 0.56748768 0.58349901
0.59019608 0.58789062 0.585 0.57227139]
mean value: 0.5830013913333011
MCC on Blind test: 0.16
Accuracy on Blind test: 0.81
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02263427 0.02391458 0.0249486 0.02442861 0.02409029 0.02448201
0.02420545 0.02391005 0.02600193 0.02408195]
mean value: 0.024269771575927735
key: score_time
value: [0.01029468 0.01053619 0.01077318 0.011096 0.01103663 0.01040053
0.01075554 0.01044893 0.01049995 0.010288 ]
mean value: 0.010612964630126953
key: test_mcc
value: [0.2756463 0.32627309 0.28489697 0.22926047 0.29718 0.21225853
0.32604259 0.26498751 0.29715342 0.24163991]
mean value: 0.27553387906510907
key: train_mcc
value: [0.2779604 0.27804532 0.28126335 0.27765365 0.27729577 0.28368129
0.27582058 0.28144837 0.27435818 0.27848889]
mean value: 0.2786015813998174
key: test_fscore
value: [0.53658537 0.568 0.54251012 0.51538462 0.55384615 0.49586777
0.56302521 0.53061224 0.55060729 0.51452282]
mean value: 0.5370961589145108
key: train_fscore
value: [0.53797468 0.53928571 0.54232804 0.53825383 0.53835801 0.54219031
0.53798034 0.54063763 0.53638814 0.53786848]
mean value: 0.539126516839614
key: test_precision
value: [0.44897959 0.47019868 0.4527027 0.41614907 0.44720497 0.41666667
0.47857143 0.44520548 0.45945946 0.43661972]
mean value: 0.4471757759762675
key: train_precision
value: [0.44973545 0.44707624 0.44597534 0.44861215 0.44776119 0.45142003
0.44658754 0.44992526 0.44652206 0.4506079 ]
mean value: 0.4484223176708898
key: test_recall
value: [0.66666667 0.71717172 0.67676768 0.67676768 0.72727273 0.6122449
0.68367347 0.65656566 0.68686869 0.62626263]
mean value: 0.6730261801690374
key: train_recall
value: [0.66929134 0.67941507 0.69178853 0.67266592 0.67491564 0.67865169
0.67640449 0.67716535 0.67154106 0.66704162]
mean value: 0.6758880701710039
key: test_accuracy
value: [0.64150943 0.66037736 0.64465409 0.60377358 0.63522013 0.61514196
0.67192429 0.63722397 0.64984227 0.63091483]
mean value: 0.6390581909806956
key: train_accuracy
value: [0.64228211 0.63878194 0.63668183 0.64088204 0.63983199 0.64310707
0.63820854 0.64205738 0.63890833 0.64345696]
mean value: 0.6404198201512593
key: test_roc_auc
value: [0.64840183 0.67593746 0.65345233 0.62377197 0.66044002 0.61434163
0.67517007 0.64250301 0.6599481 0.62964507]
mean value: 0.6483611483979534
key: train_roc_auc
value: [0.64968632 0.64992095 0.65178857 0.64959515 0.64944969 0.6528421
0.64866973 0.65168578 0.64785788 0.64992508]
mean value: 0.6501421247847838
key: test_jcc
value: [0.36666667 0.39664804 0.37222222 0.34715026 0.38297872 0.32967033
0.39181287 0.36111111 0.37988827 0.34636872]
mean value: 0.36745172055719794
key: train_jcc
value: [0.36796537 0.36919315 0.37205082 0.3682266 0.36832413 0.37192118
0.36797066 0.37046154 0.3664825 0.367866 ]
mean value: 0.3690461955353015
MCC on Blind test: 0.13
Accuracy on Blind test: 0.66
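The key/value/mean-value triples printed for each classifier follow the shape of a scikit-learn cross_validate result. A hedged sketch of how such output could be produced with 10-fold stratified CV is below; pipe, X and y stand for the pipeline and training data built earlier in this script, and the exact scorer definitions are assumptions (the log only shows the resulting keys).
import numpy as np
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

# Assumed scorer set -- chosen to reproduce the keys printed in this log
# (mcc, fscore, precision, recall, accuracy, roc_auc, jcc).
scoring = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': 'f1',
           'precision': 'precision',
           'recall': 'recall',
           'accuracy': 'accuracy',
           'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=skf, scoring=scoring,
                        return_train_score=True)

for key, value in scores.items():   # fit_time, score_time, test_mcc, train_mcc, ...
    print('key:', key)
    print('value:', value)
    print('mean value:', np.mean(value))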
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [7.27414227 3.838907 3.61950493 3.82146931 3.62214065 3.50071144
3.40100431 3.41672587 3.64883542 3.53746629]
mean value: 3.9680907487869264
key: score_time
value: [0.10032892 0.09466887 0.11602378 0.09542131 0.09496522 0.09435749
0.09440303 0.09472942 0.09471536 0.10461044]
mean value: 0.09842238426208497
key: test_mcc
value: [0.30671567 0.37488995 0.22435927 0.25050271 0.29281953 0.25598683
0.24476527 0.32147586 0.29151339 0.19490704]
mean value: 0.2757935509805952
key: train_mcc
value: [0.61851291 0.62317132 0.62116114 0.61188236 0.60834135 0.61403601
0.61934409 0.61217102 0.60581604 0.62845522]
mean value: 0.616289146448969
key: test_fscore
value: [0.38235294 0.49350649 0.37333333 0.36619718 0.43137255 0.37762238
0.36619718 0.42758621 0.40277778 0.31654676]
mean value: 0.39374928081197236
key: train_fscore
value: [0.67194245 0.67902996 0.67480258 0.65785609 0.65938865 0.66858375
0.67146974 0.66857963 0.65345081 0.68191565]
mean value: 0.6687019309842211
key: test_precision
value: [0.7027027 0.69090909 0.54901961 0.60465116 0.61111111 0.6
0.59090909 0.67391304 0.64444444 0.55 ]
mean value: 0.6217660254188535
key: train_precision
value: [0.93213573 0.92787524 0.93253968 0.94714588 0.93402062 0.92814371
0.93574297 0.92277228 0.94080338 0.93529412]
mean value: 0.933647361268348
key: test_recall
value: [0.26262626 0.38383838 0.28282828 0.26262626 0.33333333 0.2755102
0.26530612 0.31313131 0.29292929 0.22222222]
mean value: 0.2894351680065966
key: train_recall
value: [0.52530934 0.53543307 0.52868391 0.50393701 0.5095613 0.52247191
0.52359551 0.52418448 0.50056243 0.53655793]
mean value: 0.5210296887046423
key: test_accuracy
value: [0.73584906 0.75471698 0.70440252 0.71698113 0.72641509 0.7192429
0.71608833 0.73817035 0.72870662 0.70031546]
mean value: 0.7240888439180206
key: train_accuracy
value: [0.84039202 0.84249212 0.84144207 0.83689184 0.83619181 0.83869839
0.84044787 0.8383485 0.83484955 0.84429671]
mean value: 0.8394050878191216
key: test_roc_auc
value: [0.60619898 0.65310641 0.58890273 0.59250035 0.61872146 0.59665921
0.59155717 0.62216199 0.6097674 0.56982671]
mean value: 0.6049402391078184
key: train_roc_auc
value: [0.75401646 0.75831613 0.75570375 0.74561688 0.74665057 0.75208961
0.75366767 0.75218873 0.74317101 0.75989908]
mean value: 0.7521319883361022
key: test_jcc
value: [0.23636364 0.32758621 0.2295082 0.22413793 0.275 0.23275862
0.22413793 0.27192982 0.25217391 0.18803419]
mean value: 0.246163044837919
key: train_jcc
value: [0.50595883 0.51403888 0.5092091 0.49015317 0.49185668 0.50215983
0.50542299 0.50215517 0.48527808 0.51735358]
mean value: 0.5023586310935345
MCC on Blind test: 0.15
Accuracy on Blind test: 0.83
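The two blind-test lines that close each classifier block are consistent with refitting the pipeline on the cross-validation data and scoring a separate held-out set. A hedged sketch follows; X_bts and y_bts are placeholders for that blind set, and the two-decimal rounding matches the log.
from sklearn.metrics import accuracy_score, matthews_corrcoef

pipe.fit(X, y)                        # refit on the full CV data
y_bts_pred = pipe.predict(X_bts)      # X_bts/y_bts: held-out blind test set (placeholders)
print('MCC on Blind test:', round(matthews_corrcoef(y_bts, y_bts_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_bts, y_bts_pred), 2))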
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02231169 0.01784849 0.01689744 0.01896119 0.01992321 0.01952839
0.0193851 0.0192852 0.01740861 0.01712728]
mean value: 0.01886765956878662
key: score_time
value: [0.14976525 0.03845787 0.02581477 0.02634478 0.02897167 0.02806425
0.02564049 0.02947783 0.02725768 0.02792573]
mean value: 0.04077203273773193
key: test_mcc
value: [0.21648847 0.19498215 0.16889461 0.09707466 0.23050272 0.12049911
0.15527277 0.14668695 0.25205388 0.13355999]
mean value: 0.17160153149825258
key: train_mcc
value: [0.47759668 0.4981481 0.50188 0.48960134 0.47759668 0.48903674
0.49831633 0.48375131 0.4793105 0.48632245]
mean value: 0.4881560137118385
key: test_fscore
value: [0.41420118 0.41573034 0.37349398 0.31707317 0.43023256 0.35087719
0.36363636 0.35365854 0.44444444 0.35502959]
mean value: 0.38183773487329076
key: train_fscore
value: [0.60739779 0.62411348 0.62717321 0.60863787 0.60739779 0.61618123
0.62362815 0.61538462 0.60330033 0.61498708]
mean value: 0.6148201551834488
key: test_precision
value: [0.5 0.46835443 0.46268657 0.4 0.50684932 0.4109589
0.44776119 0.44615385 0.52777778 0.42857143]
mean value: 0.4599113463254912
key: train_precision
value: [0.71779141 0.73111782 0.73343373 0.74350649 0.71779141 0.72671756
0.73292868 0.71535022 0.73003195 0.72230653]
mean value: 0.7270975809842041
key: test_recall
value: [0.35353535 0.37373737 0.31313131 0.26262626 0.37373737 0.30612245
0.30612245 0.29292929 0.38383838 0.3030303 ]
mean value: 0.326881055452484
key: train_recall
value: [0.5264342 0.54443195 0.54780652 0.5151856 0.5264342 0.53483146
0.54269663 0.53993251 0.51406074 0.53543307]
mean value: 0.5327246875039496
key: test_accuracy
value: [0.68867925 0.67295597 0.67295597 0.64779874 0.6918239 0.64984227
0.66876972 0.66561514 0.70031546 0.65615142]
mean value: 0.6714907842787136
key: train_accuracy
value: [0.78823941 0.7959398 0.79733987 0.79383969 0.78823941 0.79251225
0.7960112 0.79006298 0.78971309 0.79146256]
mean value: 0.7923360251287582
key: test_roc_auc
value: [0.596859 0.59097828 0.57437388 0.54227204 0.60467691 0.55488771
0.56858634 0.56389584 0.61393754 0.55977203]
mean value: 0.5770239553162047
key: train_roc_auc
value: [0.71646913 0.7269924 0.72893375 0.71745052 0.71646913 0.72193809
0.72663287 0.72146448 0.71411519 0.72124625]
mean value: 0.7211711794863453
key: test_jcc
value: [0.26119403 0.26241135 0.22962963 0.1884058 0.27407407 0.21276596
0.22222222 0.21481481 0.28571429 0.21582734]
mean value: 0.23670594965012573
key: train_jcc
value: [0.4361603 0.45360825 0.45684803 0.43744031 0.4361603 0.44527596
0.45309568 0.44444444 0.43194707 0.44402985]
mean value: 0.4439010188312159
MCC on Blind test: 0.08
Accuracy on Blind test: 0.76
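As a quick sanity check on the reporting, each printed mean value is the plain average of the ten fold scores above it; for example, for the K-Nearest Neighbors test_mcc array:
import numpy as np

knn_test_mcc = np.array([0.21648847, 0.19498215, 0.16889461, 0.09707466,
                         0.23050272, 0.12049911, 0.15527277, 0.14668695,
                         0.25205388, 0.13355999])
print(knn_test_mcc.mean())   # 0.17160153..., matching the logged mean value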
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.13084149 0.10532331 0.11436629 0.11252785 0.11256433 0.11294079
0.11225128 0.11552978 0.11344838 0.11476493]
mean value: 0.11445584297180175
key: score_time
value: [0.0199275 0.01315784 0.0132935 0.01322722 0.01331925 0.01322865
0.01308274 0.01316428 0.01319218 0.01323509]
mean value: 0.013882827758789063
key: test_mcc
value: [0.44234757 0.45731285 0.44201592 0.44941651 0.51278339 0.41015062
0.33101086 0.42336178 0.37875687 0.24207798]
mean value: 0.4089234364412692
key: train_mcc
value: [0.48806132 0.49702093 0.48392003 0.47513942 0.46924098 0.4956723
0.50110167 0.4997578 0.48234701 0.49316022]
mean value: 0.48854216866951583
key: test_fscore
value: [0.5862069 0.62626263 0.59668508 0.58959538 0.64835165 0.55621302
0.51136364 0.58100559 0.55913978 0.43529412]
mean value: 0.569011777306206
key: train_fscore
value: [0.626401 0.63450835 0.62099309 0.61470773 0.60946372 0.63085572
0.63647643 0.63517713 0.62274705 0.62967581]
mean value: 0.62610060135097
key: test_precision
value: [0.68 0.62626263 0.65853659 0.68918919 0.71084337 0.66197183
0.57692308 0.65 0.59770115 0.52112676]
mean value: 0.6372554592209305
key: train_precision
value: [0.70153417 0.70467033 0.7037037 0.6965812 0.69396552 0.71026723
0.71052632 0.70972222 0.69583333 0.70629371]
mean value: 0.7033097724243333
key: test_recall
value: [0.51515152 0.62626263 0.54545455 0.51515152 0.5959596 0.47959184
0.45918367 0.52525253 0.52525253 0.37373737]
mean value: 0.5160997732426303
key: train_recall
value: [0.56580427 0.57705287 0.55568054 0.55005624 0.54330709 0.56741573
0.57640449 0.57480315 0.56355456 0.56805399]
mean value: 0.5642132935630236
key: test_accuracy
value: [0.77358491 0.7672956 0.77044025 0.77672956 0.79874214 0.76340694
0.72870662 0.76340694 0.74132492 0.69716088]
mean value: 0.7580798761978454
key: train_accuracy
value:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
[ConvergenceWarning and UndefinedMetricWarning repeated verbatim across the remaining cross-validation folds; most duplicates omitted]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
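The two warnings above are advisory: lbfgs hit its iteration cap, and at least one CV split produced no positive predictions, so precision was forced to 0. A minimal sketch of the remedies they point to (synthetic data stands in for the real feature matrix; the pipelines in this log already MinMax-scale the numeric columns, so the remaining knob is max_iter):

# Sketch only -- not the script's actual code.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=300, n_features=20, weights=[0.7], random_state=42)

clf = LogisticRegression(random_state=42, max_iter=5000)     # raise the lbfgs iteration budget
precision_0 = make_scorer(precision_score, zero_division=0)  # make the zero-division behaviour explicit

scores = cross_validate(clf, X, y, cv=10, scoring={'prec': precision_0})
print(scores['test_prec'].mean())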
[0.7899895 0.79313966 0.78893945 0.78543927 0.78333917 0.79321204
0.79496151 0.79461162 0.78761372 0.79216235]
mean value: 0.7903408273982628
key: test_roc_auc
value: [0.70278124 0.72865643 0.70880033 0.70506434 0.74318528 0.6850014
0.65424937 0.69840608 0.68235103 0.60888704]
mean value: 0.6917382532586576
key: train_roc_auc
value: [0.72853222 0.73390245 0.72499474 0.72091227 0.71753769 0.73137047
0.73510265 0.73432895 0.72616529 0.73070043]
mean value: 0.728354716454946
key: test_jcc
value: [0.41463415 0.45588235 0.42519685 0.41803279 0.4796748 0.3852459
0.34351145 0.40944882 0.3880597 0.27819549]
mean value: 0.39978822944425574
key: train_jcc
value: [0.45602901 0.46467391 0.45031905 0.44373866 0.43829401 0.46076642
0.46678799 0.46539162 0.45216606 0.45950864]
mean value: 0.4557675387437937
MCC on Blind test: 0.21
Accuracy on Blind test: 0.81
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
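The pipeline printed above MinMax-scales the 165 numeric columns, one-hot encodes the six categorical ones, passes any remaining columns through, and then fits LogisticRegression. A hedged sketch of how such a pipeline could be assembled (the column lists here are abbreviated placeholders, not the full Index shown above):

# Sketch only: reconstructs the shape of the printed pipeline.
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

num_cols = ['KOLA920101', 'MIYS930101', 'snap2_score', 'volumetric_rr']   # ...165 columns in total
cat_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
            'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(remainder='passthrough',
                         transformers=[('num', MinMaxScaler(), num_cols),
                                       ('cat', OneHotEncoder(), cat_cols)])

pipe = Pipeline(steps=[('prep', prep),
                       ('model', LogisticRegression(random_state=42))])
# pipe.fit(X_train, y_train) would then scale, encode and fit in one call.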
key: fit_time
value: [0.07705808 0.06742597 0.06936812 0.06842589 0.06933832 0.06753731
0.06754708 0.06820035 0.0697844 0.06808066]
mean value: 0.06927661895751953
key: score_time
value: [0.01451421 0.01613545 0.01672888 0.01634574 0.0162847 0.0156045
0.01679635 0.01681685 0.01672053 0.0144577 ]
mean value: 0.016040492057800292
key: test_mcc
value: [0.38272586 0.4247393 0.42635002 0.49631889 0.56377951 0.40805741
0.34745918 0.32933631 0.37576339 0.30137094]
mean value: 0.4055900805108289
key: train_mcc
value: [0.46952888 0.45871184 0.46134627 0.45142742 0.45043235 0.45806692
0.48057079 0.47766929 0.45666003 0.47655688]
mean value: 0.46409706732179884
key: test_fscore
value: [0.5380117 0.59685864 0.57471264 0.61988304 0.6779661 0.5508982
0.53763441 0.51396648 0.55434783 0.46987952]
mean value: 0.5634158557759774
key: train_fscore
value: [0.61152882 0.60462211 0.60240964 0.59349075 0.59259259 0.60263653
0.62103298 0.61892902 0.60125786 0.61499685]
mean value: 0.6063497153807326
key: test_precision
value: [0.63888889 0.61956522 0.66666667 0.73611111 0.76923077 0.66666667
0.56818182 0.575 0.6 0.58208955]
mean value: 0.642240069037603
key: train_precision
value: [0.69024045 0.67977528 0.69040698 0.68584071 0.68537666 0.68278805
0.69595537 0.69316597 0.68188302 0.6991404 ]
mean value: 0.6884572895485778
key: test_recall
value: [0.46464646 0.57575758 0.50505051 0.53535354 0.60606061 0.46938776
0.51020408 0.46464646 0.51515152 0.39393939]
mean value: 0.5040197897340755
key: train_recall
value: [0.54893138 0.54443195 0.53430821 0.52305962 0.52193476 0.53932584
0.56067416 0.55905512 0.53768279 0.54893138]
mean value: 0.5418335208098988
key: test_accuracy
value: [0.75157233 0.75786164 0.7672956 0.79559748 0.82075472 0.76340694
0.72870662 0.72555205 0.74132492 0.72239748]
mean value: 0.7574469773624586
key: train_accuracy
value: [0.78298915 0.77843892 0.78053903 0.77703885 0.77668883 0.77851645
0.78691393 0.78586424 0.77816655 0.78621414]
mean value: 0.7811370082150015
key: test_roc_auc
value: [0.6729625 0.70797011 0.69544763 0.72429777 0.76193441 0.68218246
0.66834405 0.65434158 0.67959411 0.63274951]
mean value: 0.6879824138230932
key: train_roc_auc
value: [0.71882545 0.71428914 0.71303825 0.70741396 0.70685153 0.71300642
0.7249509 0.72366164 0.71221367 0.72113913]
mean value: 0.7155390073670076
key: test_jcc
value: [0.368 0.42537313 0.40322581 0.44915254 0.51282051 0.38016529
0.36764706 0.34586466 0.38345865 0.30708661]
mean value: 0.3942794266496998
key: train_jcc
value: [0.44043321 0.43330349 0.43103448 0.42196007 0.42105263 0.43126685
0.45036101 0.44815149 0.42985612 0.44404004]
mean value: 0.4351459387947335
MCC on Blind test: 0.21
Accuracy on Blind test: 0.81
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [1.03903151 0.85412979 0.95302415 0.85611701 0.846174 0.96966052
0.85819697 0.98818779 0.84459376 0.89470124]
mean value: 0.9103816747665405
key: score_time
value: [0.0142529 0.01379871 0.01375461 0.01368833 0.01371336 0.01374245
0.01383734 0.01373792 0.01377058 0.01366472]
mean value: 0.013796091079711914
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.68867925 0.68867925 0.68867925 0.68867925 0.68867925 0.69085174
0.69085174 0.68769716 0.68769716 0.68769716]
mean value: 0.6888191179096482
key: train_accuracy
value: [0.68883444 0.68883444 0.68883444 0.68883444 0.68883444 0.68859342
0.68859342 0.68894332 0.68894332 0.68894332]
mean value: 0.6888189003571943
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.85
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [8.49480152 2.86519742 6.34546757 9.44811416 3.94816184 4.73981071
7.55361819 6.00784278 6.12239981 4.50449347]
mean value: 6.002990746498108
key: score_time
value: [0.0140295 0.01401949 0.01404476 0.01379061 0.01377058 0.01789975
0.01386952 0.01386666 0.01387525 0.01392317]
mean value: 0.014308929443359375
key: test_mcc
value: [0.34279462 0.4337567 0.44548428 0.38397675 0.51099862 0.39635301
0.28106552 0.32964312 0.433356 0.37282278]
mean value: 0.39302513979044884
key: train_mcc
value: [0.65110375 0.51909188 0.61718222 0.68045699 0.57344683 0.57573107
0.62969902 0.56766271 0.60712637 0.57153997]
mean value: 0.5993040809349456
key: test_fscore
value: [0.51977401 0.60913706 0.61538462 0.57575758 0.64444444 0.56830601
0.48044693 0.48148148 0.63114754 0.54945055]
mean value: 0.5675330212942535
key: train_fscore
value: [0.75295508 0.65452338 0.73636875 0.77971831 0.6937046 0.70175439
0.72359266 0.65853659 0.7370538 0.70340909]
mean value: 0.7141616636396158
key: test_precision
value: [0.58974359 0.6122449 0.625 0.57575758 0.71604938 0.61176471
0.5308642 0.61904762 0.53103448 0.60240964]
mean value: 0.6013916089950072
key: train_precision
value: [0.79327522 0.71108179 0.73595506 0.78103837 0.75098296 0.73170732
0.82778582 0.82793867 0.66636364 0.71067738]
mean value: 0.7536806229638839
key: test_recall
value: [0.46464646 0.60606061 0.60606061 0.57575758 0.58585859 0.53061224
0.43877551 0.39393939 0.77777778 0.50505051]
mean value: 0.5484539270253557
key: train_recall
value: [0.71653543 0.60629921 0.7367829 0.7784027 0.64454443 0.6741573
0.64269663 0.54668166 0.82452193 0.69628796]
mean value: 0.6866910175553899
key: test_accuracy
value: [0.7327044 0.75786164 0.76415094 0.73584906 0.79874214 0.75078864
0.70662461 0.73501577 0.71608833 0.74132492]
mean value: 0.7439150447394004
key: train_accuracy
value: [0.85369268 0.80084004 0.83584179 0.86314316 0.82289114 0.82155353
0.84709587 0.8236529 0.8170049 0.81735479]
mean value: 0.8303070821840602
key: test_roc_auc
value: [0.65926387 0.71627231 0.72083852 0.69198838 0.74041788 0.68996366
0.63262976 0.64192383 0.73292559 0.67683718]
mean value: 0.6903060971895242
key: train_roc_auc
value: [0.81609292 0.74750936 0.80868617 0.83991273 0.77399986 0.78118434
0.79111457 0.7476933 0.81906645 0.78415211]
mean value: 0.7909411812332767
key: test_jcc
value: [0.35114504 0.4379562 0.44444444 0.40425532 0.47540984 0.39694656
0.31617647 0.31707317 0.46107784 0.37878788]
mean value: 0.39832727715111504
key: train_jcc
value: [0.60379147 0.48646209 0.58274021 0.63896584 0.53104727 0.54054054
0.56689792 0.49090909 0.58359873 0.54250657]
mean value: 0.556745972768614
MCC on Blind test: 0.18
Accuracy on Blind test: 0.81
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02449942 0.02457833 0.02449369 0.02459645 0.02475786 0.02442098
0.02442241 0.02433801 0.0456059 0.02570391]
mean value: 0.026741695404052735
key: score_time
value: [0.01339221 0.01355982 0.01327038 0.01335931 0.01334 0.01333904
0.01324701 0.0144403 0.01485491 0.0139935 ]
mean value: 0.01367964744567871
key: test_mcc
value: [0.23921949 0.29736313 0.19223133 0.15848888 0.25640625 0.08033908
0.20634419 0.26478608 0.16210141 0.19003604]
mean value: 0.20473158688555065
key: train_mcc
value: [0.21171866 0.21135828 0.20479648 0.2173008 0.2121115 0.22369518
0.22021477 0.20488553 0.21477921 0.21840834]
mean value: 0.21392687433566487
key: test_fscore
value: [0.48803828 0.52427184 0.45098039 0.44545455 0.50909091 0.37438424
0.45771144 0.51376147 0.43137255 0.4600939 ]
mean value: 0.4655159561736876
key: train_fscore
value: [0.46996279 0.4684492 0.47192169 0.47548761 0.46742057 0.48372093
0.47583643 0.46610169 0.46995708 0.4729802 ]
mean value: 0.4721838197532856
key: test_precision
value: [0.46363636 0.5046729 0.43809524 0.40495868 0.46280992 0.36190476
0.44660194 0.47058824 0.41904762 0.42982456]
mean value: 0.44021402133667664
key: train_precision
value: [0.44556452 0.44648318 0.43536122 0.44742063 0.44834711 0.44784689
0.45115811 0.44044044 0.44923077 0.45102041]
mean value: 0.4462873270179715
key: test_recall
value: [0.51515152 0.54545455 0.46464646 0.49494949 0.56565657 0.3877551
0.46938776 0.56565657 0.44444444 0.49494949]
mean value: 0.49480519480519475
key: train_recall
value: [0.49718785 0.49268841 0.5151856 0.50731159 0.48818898 0.5258427
0.50337079 0.49493813 0.49268841 0.49718785]
mean value: 0.5014590311042579
key: test_accuracy
value: [0.66352201 0.6918239 0.64779874 0.6163522 0.66037736 0.59936909
0.65615142 0.66561514 0.6340694 0.63722397]
mean value: 0.6472303235918497
key: train_accuracy
value: [0.65103255 0.6520826 0.64123206 0.65173259 0.65383269 0.65045486
0.6546536 0.64730581 0.65430371 0.65535339]
mean value: 0.6511983874211204
key: test_roc_auc
value: [0.62287256 0.65172271 0.59762004 0.58309119 0.63442646 0.54090951
0.60455689 0.63833287 0.58231397 0.59839218]
mean value: 0.605423836563085
key: train_roc_auc
value: [0.60885815 0.60838689 0.60667817 0.61214157 0.60842376 0.61632582
0.61321995 0.60551884 0.60998057 0.61197635]
mean value: 0.6101510059706898
key: test_jcc
value: [0.32278481 0.35526316 0.29113924 0.28654971 0.34146341 0.23030303
0.29677419 0.34567901 0.275 0.29878049]
mean value: 0.3043737054766108
key: train_jcc
value: [0.30715775 0.30586592 0.30883345 0.31189488 0.30498946 0.3190184
0.31219512 0.3038674 0.30715288 0.30974071]
mean value: 0.30907159774019266
MCC on Blind test: 0.05
Accuracy on Blind test: 0.72
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02841139 0.02749515 0.02763081 0.02747178 0.02739096 0.02754045
0.02740359 0.0273087 0.02756333 0.02737761]
mean value: 0.027559375762939452
key: score_time
value: [0.01392746 0.01385331 0.01381564 0.01380348 0.01384377 0.01377249
0.01390028 0.01376152 0.0138545 0.01383042]
mean value: 0.013836288452148437
key: test_mcc
value: [0.18595732 0.24601459 0.09519898 0.1445823 0.13839397 0.04004701
0.09337322 0.15695629 0.1839646 0.21412678]
mean value: 0.1498615059326042
key: train_mcc
value: [0.18787624 0.16637133 0.19409606 0.17262372 0.18593902 0.19798825
0.18886662 0.18776326 0.17412022 0.18671254]
mean value: 0.18423572623839357
key: test_fscore
value: [0.42553191 0.45555556 0.35675676 0.40816327 0.36781609 0.29885057
0.35675676 0.39106145 0.40677966 0.40243902]
mean value: 0.3869711053856634
key: train_fscore
value: [0.41783751 0.38664098 0.43483343 0.41498216 0.3923634 0.43822844
0.41293532 0.41620626 0.38283828 0.39687703]
mean value: 0.4093742815808145
key: test_precision
value: [0.4494382 0.50617284 0.38372093 0.41237113 0.42666667 0.34210526
0.37931034 0.4375 0.46153846 0.50769231]
mean value: 0.4306516149889458
key: train_precision
value: [0.45721925 0.4505988 0.45255474 0.44010088 0.47301587 0.45520581
0.46239554 0.45810811 0.46325879 0.47067901]
mean value: 0.45831368147071433
key: test_recall
value: [0.4040404 0.41414141 0.33333333 0.4040404 0.32323232 0.26530612
0.33673469 0.35353535 0.36363636 0.33333333]
mean value: 0.353133374561946
key: train_recall
value: [0.38470191 0.33858268 0.41844769 0.39257593 0.3352081 0.42247191
0.37303371 0.38132733 0.32620922 0.34308211]
mean value: 0.3715640601104638
key: test_accuracy
value: [0.66037736 0.6918239 0.62578616 0.63522013 0.65408805 0.61514196
0.62460568 0.65615142 0.66876972 0.69085174]
mean value: 0.6522816102216138
key: train_accuracy
value: [0.66643332 0.66573329 0.66153308 0.65558278 0.67693385 0.66270119
0.66969909 0.66724983 0.67284815 0.67564731]
mean value: 0.6674361867148642
key: test_roc_auc
value: [0.59014806 0.61574651 0.5456621 0.57188322 0.56344265 0.51849781
0.54507968 0.57355667 0.58548791 0.59327217]
mean value: 0.5702776755093673
key: train_roc_auc
value: [0.58920055 0.57604947 0.59489458 0.58348309 0.58325446 0.59690669
0.58844775 0.58883533 0.57778211 0.58444101]
mean value: 0.5863295021532389
key: test_jcc
value: [0.27027027 0.29496403 0.21710526 0.25641026 0.22535211 0.17567568
0.21710526 0.24305556 0.25531915 0.2519084 ]
mean value: 0.2407165971563317
key: train_jcc
value: [0.26409266 0.23964968 0.27781927 0.26181545 0.24406224 0.28059701
0.26018809 0.2627907 0.23673469 0.24756494]
mean value: 0.25753147409741395
MCC on Blind test: 0.07
Accuracy on Blind test: 0.76
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.05465412 0.04172897 0.04354453 0.05603123 0.05251122 0.05749369
0.04883218 0.06091332 0.04554868 0.05209565]
mean value: 0.05133535861968994
key: score_time
value: [0.01188397 0.01327181 0.01328802 0.01329947 0.01334739 0.01330185
0.01325083 0.01329541 0.01332855 0.01336432]
mean value: 0.013163161277770997
key: test_mcc
value: [0.30317195 0.38127671 0.08427712 0.30548877 0.42721606 0.35960495
0.08409409 0.39112936 0.40194579 0.33771079]
mean value: 0.30759155857392295
key: train_mcc
value: [0.29134324 0.45041674 0.16825767 0.27291552 0.38402831 0.45170607
0.15786828 0.4470723 0.33530218 0.41642114]
mean value: 0.3375331465052097
key: test_fscore
value: [0.55522388 0.60079051 0.0754717 0.55828221 0.62283737 0.57534247
0.02020202 0.56521739 0.60450161 0.57377049]
mean value: 0.4751639648155499
key: train_fscore
value: [0.5496384 0.63950734 0.14387031 0.54381271 0.60147458 0.63918526
0.07956989 0.60411622 0.57442197 0.62112855]
mean value: 0.49967252350743935
key: test_precision
value: [0.3940678 0.49350649 0.57142857 0.40088106 0.47368421 0.52066116
1. 0.61176471 0.44339623 0.48275862]
mean value: 0.5392148839352169
key: train_precision
value: [0.3882954 0.55237316 0.7244898 0.38695859 0.45912322 0.56228669
0.925 0.65399738 0.42309739 0.4986376 ]
mean value: 0.5574259232932934
key: test_recall
value: [0.93939394 0.76767677 0.04040404 0.91919192 0.90909091 0.64285714
0.01020408 0.52525253 0.94949495 0.70707071]
mean value: 0.6410636982065554
key: train_recall
value: [0.94038245 0.75928009 0.07986502 0.91451069 0.87176603 0.74044944
0.04157303 0.56130484 0.89426322 0.82339708]
mean value: 0.6626791875734634
key: test_accuracy
value: [0.53144654 0.68238994 0.6918239 0.54716981 0.6572327 0.70662461
0.69400631 0.74763407 0.61198738 0.67192429]
mean value: 0.654223954923318
key: train_accuracy
value: [0.52047602 0.73363668 0.70423521 0.52257613 0.64053203 0.7396781
0.70048985 0.77116865 0.58782365 0.68754374]
mean value: 0.6608160061607
key: test_roc_auc
value: [0.64321295 0.70575619 0.51335271 0.64909368 0.72623495 0.68900848
0.50510204 0.68693819 0.70410527 0.681517 ]
mean value: 0.6504321465353082
key: train_roc_auc
value: [0.63558757 0.74066647 0.53307275 0.63001957 0.70392163 0.73988935
0.52002432 0.71361331 0.67186498 0.72480164]
mean value: 0.6613461589615759
key: test_jcc
value: [0.38429752 0.42937853 0.03921569 0.38723404 0.45226131 0.40384615
0.01020408 0.39393939 0.43317972 0.40229885]
mean value: 0.33358552905901856
key: train_jcc
value: [0.37896646 0.47005571 0.07751092 0.3734497 0.43007769 0.46970777
0.04143337 0.43278404 0.40293969 0.45046154]
mean value: 0.35273868809140635
MCC on Blind test: 0.16
Accuracy on Blind test: 0.8
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
(this UserWarning was emitted 20 more times; duplicates omitted)
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
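The repeated "Variables are collinear" warnings above mean QDA is estimating per-class covariance matrices over columns that are (near-)linearly dependent. Two common mitigations, sketched under the assumption that the same preprocessed features are reused; neither is part of the original script:

# Sketch only.
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.pipeline import Pipeline

# Option 1: shrink the covariance estimates with QDA's built-in regulariser.
qda_reg = QuadraticDiscriminantAnalysis(reg_param=0.1)

# Option 2: project the scaled/encoded features onto an orthogonal, lower-dimensional basis first.
qda_pca = Pipeline(steps=[('pca', PCA(n_components=50)),
                          ('model', QuadraticDiscriminantAnalysis())])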
key: fit_time
value: [0.07634377 0.06435227 0.0652914 0.06445885 0.06467891 0.06503415
0.06511903 0.06447959 0.06440616 0.0650084 ]
mean value: 0.0659172534942627
key: score_time
value: [0.01460218 0.01459146 0.019032 0.01785922 0.01765656 0.01778483
0.01801515 0.01859283 0.01809716 0.0180037 ]
mean value: 0.017423510551452637
key: test_mcc
value: [0.09940296 0.05349189 0.05349189 0.11218048 0.06464171 0.08167878
0.10514085 0.1336688 0.02666625 0.08267992]
mean value: 0.08130435331381713
key: train_mcc
value: [0.12800294 0.13449475 0.1286647 0.13950293 0.13193073 0.13068242
0.12537364 0.12794723 0.1350704 0.13057581]
mean value: 0.13122455512283748
key: test_fscore
value: [0.485 0.47761194 0.47761194 0.48743719 0.4792176 0.47901235
0.4836272 0.49009901 0.47524752 0.48275862]
mean value: 0.4817623375491003
key: train_fscore
value: [0.48765771 0.4889989 0.4877915 0.49007718 0.48846154 0.4884742
0.48740416 0.48752399 0.4889989 0.48805929]
mean value: 0.48834473684140695
key: test_precision
value: [0.32225914 0.31683168 0.31683168 0.32441472 0.31612903 0.31596091
0.32107023 0.32459016 0.3147541 0.31921824]
mean value: 0.31920599000296435
key: train_precision
value: [0.32245194 0.32362577 0.32256894 0.32457101 0.32315522 0.3231663
0.32223027 0.32233503 0.32362577 0.3228032 ]
mean value: 0.3230533447967027
key: test_recall
value: [0.97979798 0.96969697 0.96969697 0.97979798 0.98989899 0.98979592
0.97959184 1. 0.96969697 0.98989899]
mean value: 0.9817872603586888
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.35220126 0.33962264 0.33962264 0.35849057 0.33018868 0.33438486
0.3533123 0.35015773 0.33123028 0.33753943]
mean value: 0.3426750391841756
key: train_accuracy
value: [0.34616731 0.34966748 0.34651733 0.35246762 0.34826741 0.34779566
0.3449965 0.34604619 0.34989503 0.34744577]
mean value: 0.3479266300613841
key: test_roc_auc
value: [0.52414557 0.51224575 0.51224575 0.52871178 0.51093123 0.5154459
0.5263256 0.52752294 0.50549069 0.5155917 ]
mean value: 0.5178656883252118
key: train_roc_auc
value: [0.5254065 0.52794715 0.52566057 0.52997967 0.52693089 0.52642276
0.52439024 0.5253936 0.5281869 0.52640934]
mean value: 0.5266727647437723
key: test_jcc
value: [0.32013201 0.31372549 0.31372549 0.32225914 0.31511254 0.31493506
0.31893688 0.32459016 0.31168831 0.31818182]
mean value: 0.31732869058150615
key: train_jcc
value: [0.32245194 0.32362577 0.32256894 0.32457101 0.32315522 0.3231663
0.32223027 0.32233503 0.32362577 0.3228032 ]
mean value: 0.3230533447967027
MCC on Blind test: 0.05
Accuracy on Blind test: 0.18
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [8.66789865 8.94918489 8.65462756 8.68244672 9.17352557 9.00306129
8.85187078 8.73806477 8.63968658 8.71503758]
mean value: 8.807540440559388
key: score_time
value: [0.14088345 0.13755059 0.13763165 0.14103055 0.14252877 0.1412437
0.13854456 0.13479519 0.13691473 0.13670754]
mean value: 0.13878307342529297
key: test_mcc
value: [0.38910015 0.4443974 0.37584284 0.48368589 0.4549412 0.45370177
0.35344751 0.38441747 0.46660178 0.34030004]
mean value: 0.41464360610308776
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.52173913 0.6010929 0.53488372 0.59259259 0.58823529 0.57142857
0.50306748 0.5433526 0.60571429 0.47435897]
mean value: 0.5536465551570596
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.67741935 0.6547619 0.63013699 0.76190476 0.70422535 0.73015873
0.63076923 0.63513514 0.69736842 0.64912281]
mean value: 0.6771002684052694
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.42424242 0.55555556 0.46464646 0.48484848 0.50505051 0.46938776
0.41836735 0.47474747 0.53535354 0.37373737]
mean value: 0.4705936920222634
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.75786164 0.77044025 0.74842767 0.79245283 0.77987421 0.78233438
0.7444795 0.75078864 0.78233438 0.74132492]
mean value: 0.7650318433426582
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66645911 0.71156773 0.6706794 0.70817767 0.70458005 0.69588109
0.65438915 0.67544713 0.71492447 0.64099713]
mean value: 0.6843102932902569
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.35294118 0.4296875 0.36507937 0.42105263 0.41666667 0.4
0.33606557 0.37301587 0.43442623 0.31092437]
mean value: 0.3839859385838028
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.22
Accuracy on Blind test: 0.84
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
(this FutureWarning was emitted 20 more times; duplicates omitted)
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
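As the deprecation warning says, max_features='auto' already means 'sqrt' for classifiers, so the future-proof spelling of the estimator printed above is (sketch, not a change made to the script):

# Sketch only: deprecation-safe equivalent of the printed RandomForestClassifier.
from sklearn.ensemble import RandomForestClassifier

rf2 = RandomForestClassifier(max_features='sqrt', min_samples_leaf=5,
                             n_estimators=1000, n_jobs=10, oob_score=True,
                             random_state=42)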
key: fit_time
value: [1.93243837 1.93650174 1.96072531 2.08350563 2.00067401 2.07300305
2.0219152 2.02988553 2.01707101 2.09575558]
mean value: 2.0151475429534913
key: score_time
value: [0.34585857 0.31771135 0.3232193 0.2915175 0.19792581 0.25355792
0.35720253 0.32670546 0.39465094 0.3022666 ]
mean value: 0.31106159687042234
key: test_mcc
value: [0.42108769 0.4285015 0.40601618 0.47384809 0.45862387 0.46024739
0.370273 0.40333091 0.45771601 0.33481667]
mean value: 0.42144613018572663
key: train_mcc
value: [0.83006155 0.82389172 0.81267983 0.81470372 0.81313434 0.81738756
0.82372571 0.82380597 0.81433781 0.81106848]
mean value: 0.8184796685314645
key: test_fscore
value: [0.53503185 0.57954545 0.55294118 0.57324841 0.57668712 0.56410256
0.49350649 0.54761905 0.59770115 0.45333333]
mean value: 0.5473716590344255
key: train_fscore
value: [0.8701623 0.86587648 0.8555205 0.85641999 0.85678392 0.86053784
0.86654252 0.86604361 0.85696282 0.85407454]
mean value: 0.8608924525378988
key: test_precision
value: [0.72413793 0.66233766 0.66197183 0.77586207 0.734375 0.75862069
0.67857143 0.66666667 0.69333333 0.66666667]
mean value: 0.7022543278216846
key: train_precision
value: [0.97755961 0.9719888 0.97413793 0.9783237 0.97012802 0.97038082
0.96809986 0.97067039 0.97421203 0.9740634 ]
mean value: 0.9729564561406503
key: test_recall
value: [0.42424242 0.51515152 0.47474747 0.45454545 0.47474747 0.44897959
0.3877551 0.46464646 0.52525253 0.34343434]
mean value: 0.4513502370645227
key: train_recall
value: [0.784027 0.78065242 0.76265467 0.76152981 0.76715411 0.77303371
0.78426966 0.78177728 0.76490439 0.76040495]
mean value: 0.7720407982710028
key: test_accuracy
value: [0.77044025 0.7672956 0.76100629 0.78930818 0.78301887 0.78548896
0.75394322 0.76025237 0.77917981 0.74132492]
mean value: 0.7691258456837886
key: train_accuracy
value: [0.92719636 0.92474624 0.91984599 0.92054603 0.92019601 0.92197341
0.92477257 0.92477257 0.92057383 0.91917425]
mean value: 0.9223797246545322
key: test_roc_auc
value: [0.67559153 0.69821503 0.68257922 0.69759236 0.69856095 0.69252633
0.65278166 0.67957094 0.70987397 0.63272635]
mean value: 0.6820018329556203
key: train_roc_auc
value: [0.88794846 0.88524491 0.87675416 0.87695393 0.87824169 0.88118149
0.88629134 0.88555598 0.87788135 0.87563163]
mean value: 0.881168492344468
key: test_jcc
value: [0.36521739 0.408 0.38211382 0.40178571 0.40517241 0.39285714
0.32758621 0.37704918 0.42622951 0.29310345]
mean value: 0.37791148270755237
key: train_jcc
value: [0.77016575 0.76347635 0.74751929 0.74889381 0.74945055 0.75521405
0.7645126 0.76373626 0.74972437 0.74531422]
mean value: 0.7558007241450726
MCC on Blind test: 0.2
Accuracy on Blind test: 0.84
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.06322074 0.03930569 0.03945494 0.03968906 0.03958464 0.04876328
0.03991079 0.03984761 0.03864717 0.03879642]
mean value: 0.042722034454345706
key: score_time
value: [0.02866697 0.02100873 0.02131486 0.02120638 0.02537441 0.0210011
0.02350616 0.02104974 0.02082944 0.02100849]
mean value: 0.022496628761291503
key: test_mcc
value: [0.40204774 0.43693228 0.40352749 0.46235752 0.52233536 0.40805741
0.3636966 0.38902446 0.3802039 0.28849237]
mean value: 0.4056675149453798
key: train_mcc
value: [0.47190008 0.47548947 0.48291078 0.45605383 0.45252749 0.48168539
0.48088248 0.47983223 0.47053213 0.48811691]
mean value: 0.4739930800008775
key: test_fscore
value: [0.54216867 0.6031746 0.56 0.59171598 0.64367816 0.5508982
0.52631579 0.54117647 0.54545455 0.45398773]
mean value: 0.5558570154294928
key: train_fscore
value: [0.60465116 0.60723514 0.60863874 0.59007833 0.58568615 0.61049903
0.61221865 0.61182519 0.60438144 0.61627907]
mean value: 0.6051492906365661
key: test_precision
value: [0.67164179 0.63333333 0.64473684 0.71428571 0.74666667 0.66666667
0.61643836 0.64788732 0.62337662 0.578125 ]
mean value: 0.6543158317587089
key: train_precision
value: [0.71016692 0.71320182 0.72769953 0.7029549 0.70347003 0.72128637
0.71578947 0.71364318 0.70739065 0.72382398]
mean value: 0.7139426848469617
key: test_recall
value: [0.45454545 0.57575758 0.49494949 0.50505051 0.56565657 0.46938776
0.45918367 0.46464646 0.48484848 0.37373737]
mean value: 0.48477633477633475
key: train_recall
value: [0.5264342 0.52868391 0.52305962 0.50843645 0.50168729 0.52921348
0.53483146 0.53543307 0.52755906 0.53655793]
mean value: 0.5251896462380404
key: test_accuracy
value: [0.76100629 0.76415094 0.75786164 0.78301887 0.80503145 0.76340694
0.7444795 0.75394322 0.74763407 0.7192429 ]
mean value: 0.7599775806995616
key: train_accuracy
value: [0.78578929 0.78718936 0.79068953 0.78018901 0.77913896 0.78971309
0.7890133 0.7886634 0.78516445 0.79216235]
mean value: 0.7867712734831983
key: test_roc_auc
value: [0.67704442 0.71253632 0.68583091 0.70686315 0.73944929 0.68218246
0.6656649 0.67498378 0.67591048 0.62494208]
mean value: 0.6845407793440971
key: train_roc_auc
value: [0.71469068 0.71632366 0.71732249 0.7056918 0.70307942 0.7183669
0.71939744 0.71921476 0.71451594 0.72206261]
mean value: 0.7150665701574657
key: test_jcc
value: [0.37190083 0.43181818 0.38888889 0.42016807 0.47457627 0.38016529
0.35714286 0.37096774 0.375 0.29365079]
mean value: 0.3864278917552016
key: train_jcc
value: [0.43333333 0.43599258 0.4374412 0.41851852 0.41411328 0.43936567
0.44114921 0.44074074 0.43305633 0.44537815]
mean value: 0.43390890133634985
MCC on Blind test: 0.22
Accuracy on Blind test: 0.82
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.2034409 0.21162343 0.21699476 0.2050066 0.20651197 0.19732952
0.20593381 0.20252585 0.19415426 0.20114136]
mean value: 0.20446624755859374
key: score_time
value: [0.02542281 0.02118587 0.02138638 0.02077341 0.02126265 0.0261662
0.0212388 0.02078891 0.02143002 0.02076817]
mean value: 0.022042322158813476
key: test_mcc
value: [0.3615934 0.42899317 0.41318371 0.51066864 0.55348076 0.42459887
0.38679409 0.33328719 0.39607703 0.27846933]
mean value: 0.408714619118937
key: train_mcc
value: [0.46162458 0.45582364 0.45950036 0.44132711 0.44330068 0.45534789
0.47530352 0.46813988 0.45406682 0.47571812]
mean value: 0.45901526189679637
key: test_fscore
value: [0.51497006 0.59016393 0.55621302 0.62275449 0.6627907 0.56287425
0.54117647 0.50292398 0.55681818 0.44444444]
mean value: 0.5555129525706386
key: train_fscore
value: [0.59225213 0.58722844 0.58753316 0.57483444 0.57428381 0.58684211
0.60570687 0.59934853 0.58591178 0.60340314]
mean value: 0.5897344417170932
key: test_precision
value: [0.63235294 0.64285714 0.67142857 0.76470588 0.78082192 0.68115942
0.63888889 0.59722222 0.63636364 0.57142857]
mean value: 0.6617229194816519
key: train_precision
value: [0.71135647 0.70793651 0.71567044 0.69887279 0.70424837 0.70793651
0.71625767 0.7120743 0.70634921 0.72143975]
mean value: 0.7102141998854965
key: test_recall
value: [0.43434343 0.54545455 0.47474747 0.52525253 0.57575758 0.47959184
0.46938776 0.43434343 0.49494949 0.36363636]
mean value: 0.47974644403215827
key: train_recall
value: [0.50731159 0.50168729 0.49831271 0.48818898 0.4848144 0.5011236
0.5247191 0.51743532 0.50056243 0.51856018]
mean value: 0.5042715587517852
key: test_accuracy
value: [0.74528302 0.76415094 0.76415094 0.80188679 0.81761006 0.76971609
0.75394322 0.7318612 0.75394322 0.71608833]
mean value: 0.7618633811479475
key: train_accuracy
value: [0.78263913 0.78053903 0.78228911 0.77528876 0.77633882 0.78026592
0.78726382 0.78481456 0.77991603 0.78796361]
mean value: 0.781731878756289
key: test_roc_auc
value: [0.66009409 0.70423412 0.68486232 0.72609658 0.75134911 0.68956761
0.67533315 0.65065796 0.68325456 0.61989158]
mean value: 0.6845341076094771
key: train_roc_auc
value: [0.70716189 0.70409568 0.70444091 0.69658433 0.69642143 0.70381383
0.71535752 0.71148556 0.70330305 0.71407948]
mean value: 0.7056743673265259
key: test_jcc
value: [0.34677419 0.41860465 0.3852459 0.45217391 0.49565217 0.39166667
0.37096774 0.3359375 0.38582677 0.28571429]
mean value: 0.38685637992770233
key: train_jcc
value: [0.42070896 0.41565704 0.41596244 0.40334572 0.40280374 0.41527002
0.4344186 0.42790698 0.41433892 0.43205248]
mean value: 0.41824648996516567
MCC on Blind test: 0.22
Accuracy on Blind test: 0.82
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.48223209 0.48226428 0.45203614 0.45159507 0.46000671 0.44421244
0.44200492 0.44053578 0.44895625 0.43792653]
mean value: 0.45417702198028564
key: score_time
value: [0.11681008 0.10453534 0.11233926 0.10459447 0.11513567 0.09498525
0.09998918 0.11367464 0.10377026 0.09725881]
mean value: 0.10630929470062256
key: test_mcc
value: [0.41080349 0.40398353 0.37963379 0.41854393 0.41888558 0.31820233
0.31900981 0.37533592 0.36586639 0.3464829 ]
mean value: 0.3756747669623087
key: train_mcc
value: [0.47123718 0.47123718 0.48497423 0.46436737 0.45730556 0.47725504
0.48555224 0.48647145 0.47569875 0.4883412 ]
mean value: 0.4762440198973249
key: test_fscore
value: [0.51948052 0.54761905 0.5125 0.51006711 0.51655629 0.43537415
0.46153846 0.52121212 0.51219512 0.47058824]
mean value: 0.5007131062240038
key: train_fscore
value: [0.57780879 0.57780879 0.59043659 0.55756698 0.55021834 0.57905833
0.59156876 0.59533608 0.57627119 0.58457183]
mean value: 0.5780645683654608
key: test_precision
value: [0.72727273 0.66666667 0.67213115 0.76 0.75 0.65306122
0.62068966 0.65151515 0.64615385 0.66666667]
mean value: 0.6814157085478252
key: train_precision
value: [0.76102941 0.76102941 0.76895307 0.78252033 0.77938144 0.77298311
0.76840215 0.76274165 0.77419355 0.78816794]
mean value: 0.7719402068808268
key: test_recall
value: [0.4040404 0.46464646 0.41414141 0.38383838 0.39393939 0.32653061
0.36734694 0.43434343 0.42424242 0.36363636]
mean value: 0.39767058338486916
key: train_recall
value: [0.46569179 0.46569179 0.4791901 0.43307087 0.42519685 0.46292135
0.48089888 0.48818898 0.45894263 0.46456693]
mean value: 0.4624360157227538
key: test_accuracy
value: [0.7672956 0.76100629 0.75471698 0.77044025 0.77044025 0.73817035
0.73501577 0.75078864 0.74763407 0.7444795 ]
mean value: 0.7539987699144892
key: train_accuracy
value: [0.78823941 0.78823941 0.79313966 0.78613931 0.78368918 0.79041288
0.79321204 0.79356193 0.79006298 0.79461162]
mean value: 0.7901308413916148
key: test_roc_auc
value: [0.66777363 0.67981182 0.66140861 0.66452193 0.66728933 0.62445252
0.63344516 0.66441942 0.65936892 0.64053378]
mean value: 0.656302511331327
key: train_roc_auc
value: [0.69981744 0.69981744 0.70707473 0.68935047 0.68541347 0.7007188
0.70767505 0.70981313 0.69925293 0.70409657]
mean value: 0.700303002101365
key: test_jcc
value: [0.35087719 0.37704918 0.34453782 0.34234234 0.34821429 0.27826087
0.3 0.35245902 0.3442623 0.30769231]
mean value: 0.3345695305225938
key: train_jcc
value: [0.40628067 0.40628067 0.41887906 0.38654618 0.37951807 0.40751731
0.42001963 0.42382812 0.4047619 0.413 ]
mean value: 0.4066631614158859
MCC on Blind test: 0.21
Accuracy on Blind test: 0.82
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
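The two SettingWithCopyWarning messages above come from adding the 'source_data' column to frames that may be views of a parent DataFrame. A minimal sketch of the .loc-based fix the warning suggests, using an illustrative frame rather than the script's scoresDF_CV/scoresDF_BT:

    # A sketch only: df stands in for the parent frame the score slices come from.
    import pandas as pd

    df = pd.DataFrame({'MCC': [0.38, 0.48], 'Accuracy': [0.75, 0.82]})
    scoresDF_CV = df[['MCC', 'Accuracy']].copy()   # .copy() breaks the link to the parent frame
    scoresDF_CV.loc[:, 'source_data'] = 'CV'       # assignment no longer raises the warning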
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.05907893 0.11263537 0.09513521 0.09723878 0.08813548 0.13504672
0.11016369 0.11991549 0.09340215 0.11309052]
mean value: 0.10238423347473144
key: score_time
value: [0.01132107 0.01176143 0.01108837 0.01161647 0.01148009 0.01119804
0.0114851 0.01135135 0.01182675 0.019835 ]
mean value: 0.012296366691589355
key: test_mcc
value: [0.27068645 0.46352883 0.33876177 0.38215946 0.46156834 0.25619115
0.27659715 0.37575029 0.43335553 0.27238389]
mean value: 0.3530982873884624
key: train_mcc
value: [0.32203518 0.45660986 0.35674382 0.39337638 0.4156987 0.33732199
0.38705591 0.43933725 0.43834846 0.46690525]
mean value: 0.40134328117142193
key: test_fscore
value: [0.29268293 0.625 0.57605178 0.49006623 0.64468864 0.53674121
0.54607509 0.5 0.6302521 0.43037975]
mean value: 0.527193772367627
key: train_fscore
value: [0.33571429 0.61204819 0.58513365 0.5010989 0.6200409 0.57123381
0.60197119 0.54790632 0.62961141 0.58996656]
mean value: 0.5594725214277807
key: test_precision
value: [0.75 0.64516129 0.42380952 0.71153846 0.50574713 0.39069767
0.41025641 0.68421053 0.53956835 0.57627119]
mean value: 0.5637260544862571
key: train_precision
value: [0.81385281 0.65888457 0.43376764 0.71848739 0.48714653 0.40998043
0.45423341 0.74230769 0.55944056 0.72772277]
mean value: 0.6005823811903334
key: test_recall
value: [0.18181818 0.60606061 0.8989899 0.37373737 0.88888889 0.85714286
0.81632653 0.39393939 0.75757576 0.34343434]
mean value: 0.6117913832199547
key: train_recall
value: [0.21147357 0.57142857 0.89876265 0.38470191 0.85264342 0.94157303
0.89213483 0.43419573 0.71991001 0.49606299]
mean value: 0.6402886717811959
key: test_accuracy
value: [0.72641509 0.77358491 0.58805031 0.75786164 0.69496855 0.54258675
0.58044164 0.75394322 0.72239748 0.71608833]
mean value: 0.6856337916393865
key: train_accuracy
value: [0.73958698 0.77458873 0.60343017 0.76163808 0.67483374 0.55983205
0.63261022 0.77711686 0.73652904 0.78551435]
mean value: 0.704568022312942
key: test_roc_auc
value: [0.57721046 0.72768784 0.67323924 0.65262211 0.74809741 0.62948467
0.64560619 0.65568529 0.73199889 0.61437772]
mean value: 0.6656009825487734
key: train_roc_auc
value: [0.59481199 0.71889518 0.68439149 0.65830624 0.72357781 0.66438408
0.70368937 0.68307044 0.73197126 0.70613205]
mean value: 0.6869229898852385
key: test_jcc
value: [0.17142857 0.45454545 0.40454545 0.3245614 0.47567568 0.36681223
0.37558685 0.33333333 0.4601227 0.27419355]
mean value: 0.36408052223451903
key: train_jcc
value: [0.20171674 0.44097222 0.41356108 0.33431085 0.44931832 0.39980916
0.43058568 0.3773216 0.45944006 0.41840607]
mean value: 0.39254417802691655
MCC on Blind test: 0.27
Accuracy on Blind test: 0.81
Running classifier: 24
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.43243599 0.37717295 0.3817358 0.54870081 0.41225386 0.39443135
0.3919251 0.39013386 0.4919126 0.42256212]
mean value: 0.4243264436721802
key: score_time
value: [0.0123105 0.01224923 0.01234055 0.01304388 0.01266456 0.01245308
0.01253748 0.01215816 0.01302004 0.01320338]
mean value: 0.012598085403442382
key: test_mcc
value: [0.4285015 0.41435599 0.38358846 0.4621478 0.46217442 0.44658645
0.3373794 0.3699362 0.50527162 0.43330622]
mean value: 0.42432480631790864
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57954545 0.60098522 0.5505618 0.61621622 0.60674157 0.6
0.51428571 0.54444444 0.65263158 0.56097561]
mean value: 0.582638761065669
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.66233766 0.58653846 0.62025316 0.6627907 0.6835443 0.65853659
0.58441558 0.60493827 0.68131868 0.70769231]
mean value: 0.6452365720302338
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.51515152 0.61616162 0.49494949 0.57575758 0.54545455 0.55102041
0.45918367 0.49494949 0.62626263 0.46464646]
mean value: 0.5343537414965985
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7672956 0.74528302 0.74842767 0.77672956 0.77987421 0.77287066
0.7318612 0.74132492 0.79179811 0.77287066]
mean value: 0.7628335614943553
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.69821503 0.70990729 0.6789816 0.72166874 0.71564965 0.71158326
0.65653248 0.67408025 0.74661755 0.68874525]
mean value: 0.700198109908787
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.408 0.42957746 0.37984496 0.4453125 0.43548387 0.42857143
0.34615385 0.3740458 0.484375 0.38983051]
mean value: 0.41211953817233526
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.16
Accuracy on Blind test: 0.79
Extracting tts_split_name: logo_skf_BT_embb
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 dfs using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_embb
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
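The rowbind step logged above stacks the CV and BT score frames on their 8 shared columns. A minimal sketch with illustrative one-row frames in place of the real score dataframes:

    # A sketch only: one-row frames stand in for the CV and BT score dataframes.
    import pandas as pd

    cv_df = pd.DataFrame({'MCC': [0.38], 'Accuracy': [0.75], 'source_data': ['CV']})
    bt_df = pd.DataFrame({'MCC': [0.22], 'Accuracy': [0.82], 'source_data': ['BT']})

    common_cols = cv_df.columns.intersection(bt_df.columns)
    combined_df_wf = pd.concat([cv_df[common_cols], bt_df[common_cols]],
                               axis=0, ignore_index=True)
    print(combined_df_wf.shape)   # nrows = nrows(CV) + nrows(BT); ncols unchanged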
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: katg
Total genes: 6
Training on: 5
Training on genes: ['alr', 'pnca', 'gid', 'rpob', 'embb']
Omitted genes: ['katg']
Blind test gene: katg
/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_katg.csv
Training data dim: (3216, 171)
Training Target dim: (3216,)
Checked training df does NOT have Target var
TEST data dim: (817, 171)
TEST Target dim: (817,)
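The split described above holds out one gene (katg) as the blind test set and trains on the remaining five. A minimal sketch, assuming the CSV printed above holds the combined per-gene data and that 'gene_name' and 'dst_mode' are the grouping and target columns (assumptions based on this log, not confirmed by the script):

    # A sketch only: 'gene_name' and 'dst_mode' are assumed column names; the CSV
    # path is the file printed above.
    import pandas as pd

    combined_df = pd.read_csv('/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_katg.csv')

    bts_gene = 'katg'
    train_df = combined_df[combined_df['gene_name'] != bts_gene]
    bts_df = combined_df[combined_df['gene_name'] == bts_gene]

    target_col = 'dst_mode'                              # assumed target column
    X_train, y_train = train_df.drop(columns=[target_col]), train_df[target_col]
    X_bts, y_bts = bts_df.drop(columns=[target_col]), bts_df[target_col]
    print(X_train.shape, y_train.shape, X_bts.shape, y_bts.shape)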
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.74429488 0.74938488 0.7545929 0.76155329 0.76062679 0.75680184
0.73600912 0.72930002 0.72835684 0.72488666]
mean value: 0.7445807218551636
key: score_time
value: [0.02021694 0.0207057 0.01967144 0.02045751 0.02044201 0.02038002
0.01923156 0.01924586 0.01960039 0.02000499]
mean value: 0.019995641708374024
key: test_mcc
value: [0.49405839 0.55014005 0.43478951 0.36293649 0.48757667 0.469056
0.53923678 0.38600631 0.41760296 0.43815694]
mean value: 0.4579560093913189
key: train_mcc
value: [0.53535154 0.52014106 0.55136802 0.56560992 0.53976374 0.524511
0.54165899 0.55897827 0.5472762 0.53022096]
mean value: 0.5414879698170099
key: test_fscore
value: [0.576 0.5982906 0.51666667 0.46280992 0.5942029 0.57971014
0.6259542 0.52112676 0.51612903 0.54263566]
mean value: 0.5533525876000354
key: train_fscore
value: [0.61417323 0.60262009 0.6277245 0.63874346 0.62264151 0.60914582
0.61954625 0.63815227 0.62768702 0.61164205]
mean value: 0.6212076176105658
key: test_precision
value: [0.72 0.83333333 0.68888889 0.60869565 0.65079365 0.63492063
0.71929825 0.54411765 0.64 0.63636364]
mean value: 0.6676411689146915
key: train_precision
value: [0.74364407 0.7278481 0.75630252 0.77052632 0.73333333 0.72336066
0.74894515 0.75050302 0.74338086 0.7348643 ]
mean value: 0.743270831674278
key: test_recall
value: [0.48 0.46666667 0.41333333 0.37333333 0.54666667 0.53333333
0.55405405 0.5 0.43243243 0.47297297]
mean value: 0.4772792792792793
key: train_recall
value: [0.52309985 0.51415797 0.53651267 0.54545455 0.54098361 0.52608048
0.52827381 0.55505952 0.54315476 0.52380952]
mean value: 0.5336586739762968
key: test_accuracy
value: [0.83540373 0.85403727 0.81987578 0.79813665 0.82608696 0.81987578
0.84735202 0.78816199 0.81308411 0.81619938]
mean value: 0.8218213656856485
key: train_accuracy
value: [0.84761576 0.84277816 0.85245335 0.8569454 0.8479613 0.84346925
0.84939551 0.85388601 0.85043178 0.84559585]
mean value: 0.8490532374169414
key: test_roc_auc
value: [0.71165992 0.71916329 0.67832659 0.65022942 0.72879892 0.72010796
0.74463836 0.68724696 0.67977897 0.69600066]
mean value: 0.7015951052266842
key: train_roc_auc
value: [0.73433445 0.72806414 0.74216547 0.74821085 0.74080219 0.73267587
0.73737127 0.74963952 0.7432373 0.73333976]
mean value: 0.7389840828271048
key: test_jcc
value: [0.40449438 0.42682927 0.34831461 0.30107527 0.42268041 0.40816327
0.45555556 0.35238095 0.34782609 0.37234043]
mean value: 0.3839660223976133
key: train_jcc
value: [0.44318182 0.43125 0.45743329 0.46923077 0.45205479 0.43796526
0.44879899 0.46859296 0.45739348 0.44055069]
mean value: 0.45064520589732815
MCC on Blind test: 0.13
Accuracy on Blind test: 0.58
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
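The OOB warnings above appear because BaggingClassifier's default 10 estimators leave some training samples never out-of-bag. A minimal sketch of one common remedy, raising n_estimators (an illustrative value, not the configuration used in this run):

    # A sketch only: n_estimators=100 is an illustrative value, not this run's setting.
    from sklearn.ensemble import BaggingClassifier

    bag = BaggingClassifier(n_estimators=100, oob_score=True,
                            n_jobs=10, random_state=42)
    # bag.fit(X_train, y_train); print(bag.oob_score_)   # OOB estimate without the warning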
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.36970258 0.41086721 0.35991597 0.39537287 0.37688065 0.39013028
0.36025476 0.27856636 0.40036249 0.40210557]
mean value: 0.37441587448120117
key: score_time
value: [0.04997492 0.03865147 0.03088331 0.02966785 0.02846932 0.05022717
0.03653717 0.02389169 0.04539776 0.04975224]
mean value: 0.03834528923034668
key: test_mcc
value: [0.50440718 0.41924466 0.44490067 0.4469218 0.39415232 0.44888192
0.47569604 0.39273031 0.43943885 0.43055258]
mean value: 0.4396926315173917
key: train_mcc
value: [0.93756254 0.95036311 0.95036311 0.95130159 0.94430028 0.94259767
0.93959365 0.94833367 0.95626962 0.95518225]
mean value: 0.947586750047255
key: test_fscore
value: [0.54867257 0.4957265 0.49090909 0.52892562 0.50769231 0.546875
0.55737705 0.49180328 0.5210084 0.53846154]
mean value: 0.5227451350226022
key: train_fscore
value: [0.95015576 0.96055684 0.96055684 0.96141975 0.95612009 0.95401403
0.95186335 0.95926211 0.96541122 0.9648318 ]
mean value: 0.958419181394594
key: test_precision
value: [0.81578947 0.69047619 0.77142857 0.69565217 0.6 0.66037736
0.70833333 0.625 0.68888889 0.625 ]
mean value: 0.6880945990214804
key: train_precision
value: [0.99510604 0.99839228 0.99839228 0.9968 0.9888535 1.
0.99512987 0.99205087 0.99841017 0.99213836]
mean value: 0.9955273389184505
key: test_recall
value: [0.41333333 0.38666667 0.36 0.42666667 0.44 0.46666667
0.45945946 0.40540541 0.41891892 0.47297297]
mean value: 0.425009009009009
key: train_recall
value: [0.90909091 0.92548435 0.92548435 0.92846498 0.92548435 0.91207154
0.91220238 0.92857143 0.93452381 0.9389881 ]
mean value: 0.9240366191185864
key: test_accuracy
value: [0.84161491 0.81677019 0.82608696 0.82298137 0.80124224 0.81987578
0.8317757 0.80685358 0.82242991 0.81308411]
mean value: 0.8202714730752115
key: train_accuracy
value: [0.97788528 0.98237733 0.98237733 0.98272287 0.98030408 0.97961299
0.97858377 0.98169257 0.98445596 0.98411054]
mean value: 0.9814122721896175
key: test_roc_auc
value: [0.69249663 0.66701754 0.66380567 0.68499325 0.67546559 0.69689609
0.70138965 0.66626546 0.68111938 0.69397637]
mean value: 0.6823425611846663
key: train_roc_auc
value: [0.95387069 0.96251725 0.96251725 0.96378265 0.96116773 0.95603577
0.95542643 0.96316111 0.96703698 0.96836944]
mean value: 0.9613885299821453
key: test_jcc
value: [0.37804878 0.32954545 0.3253012 0.35955056 0.34020619 0.37634409
0.38636364 0.32608696 0.35227273 0.36842105]
mean value: 0.35421406460284866
key: train_jcc
value: [0.90504451 0.92410714 0.92410714 0.92570579 0.9159292 0.91207154
0.90814815 0.92171344 0.93313522 0.93205318]
mean value: 0.9202015310641395
MCC on Blind test: 0.09
Accuracy on Blind test: 0.56
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.20532584 0.19792724 0.19911909 0.19889903 0.19191718 0.1820128
0.2169416 0.18759751 0.21631169 0.22144818]
mean value: 0.20175001621246338
key: score_time
value: [0.01008534 0.01013112 0.01021385 0.01014996 0.00989389 0.01073194
0.01058435 0.01002741 0.01027775 0.01041842]
mean value: 0.010251402854919434
key: test_mcc
value: [0.29825329 0.36263888 0.29825329 0.24152322 0.30187859 0.30505077
0.27467098 0.3620166 0.33264033 0.25824772]
mean value: 0.30351736659523143
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.45945946 0.50684932 0.45945946 0.42767296 0.47133758 0.47852761
0.45783133 0.51006711 0.48648649 0.44025157]
mean value: 0.46979428751507485
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.46575342 0.52112676 0.46575342 0.4047619 0.45121951 0.44318182
0.41304348 0.50666667 0.48648649 0.41176471]
mean value: 0.4569758182313669
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.45333333 0.49333333 0.45333333 0.45333333 0.49333333 0.52
0.51351351 0.51351351 0.48648649 0.47297297]
mean value: 0.4853153153153153
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7515528 0.77639752 0.7515528 0.7173913 0.74223602 0.73602484
0.71962617 0.77258567 0.76323988 0.72274143]
mean value: 0.7453348425920552
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.6477193 0.67784076 0.6477193 0.62545209 0.65557355 0.66080972
0.64744502 0.68185797 0.66632017 0.63527191]
mean value: 0.6546009774957142
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.29824561 0.33944954 0.29824561 0.272 0.30833333 0.31451613
0.296875 0.34234234 0.32142857 0.28225806]
mean value: 0.30736942100072134
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.1
Accuracy on Blind test: 0.57
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02114892 0.02141356 0.02144313 0.02167511 0.02187777 0.02149224
0.02151847 0.02173924 0.02138305 0.02175546]
mean value: 0.021544694900512695
key: score_time
value: [0.00971389 0.00987816 0.00978661 0.00981236 0.00979972 0.00991654
0.00973964 0.00974202 0.00971723 0.00974655]
mean value: 0.009785270690917969
key: test_mcc
value: [0.32719504 0.09614035 0.29175121 0.07436741 0.17629851 0.20544197
0.24682637 0.20970566 0.21775412 0.27511124]
mean value: 0.21205918643483992
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.47945205 0.30666667 0.45205479 0.29139073 0.38554217 0.38926174
0.42384106 0.39189189 0.39160839 0.46242775]
mean value: 0.397413724686737
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.49295775 0.30666667 0.46478873 0.28947368 0.35164835 0.39189189
0.41558442 0.39189189 0.4057971 0.4040404 ]
mean value: 0.3914740886256663
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.46666667 0.30666667 0.44 0.29333333 0.42666667 0.38666667
0.43243243 0.39189189 0.37837838 0.54054054]
mean value: 0.4063243243243243
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.76397516 0.67701863 0.7515528 0.66770186 0.68322981 0.7173913
0.72897196 0.71962617 0.72897196 0.71028037]
mean value: 0.714872003250711
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66045884 0.54807018 0.64307692 0.53735493 0.59390013 0.60224022
0.6251231 0.60485283 0.60619324 0.65083707]
mean value: 0.6072107451581136
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.31531532 0.18110236 0.2920354 0.17054264 0.23880597 0.24166667
0.26890756 0.24369748 0.24347826 0.30075188]
mean value: 0.24963035308105833
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.04
Accuracy on Blind test: 0.53
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.52306628 0.49588966 0.50772023 0.48006439 0.49682546 0.50883174
0.4852891 0.4813776 0.48473597 0.4788816 ]
mean value: 0.49426820278167727
key: score_time
value: [0.02674747 0.02696633 0.02436972 0.02601957 0.02643299 0.02625871
0.02477264 0.02506065 0.02469015 0.02463913]
mean value: 0.025595736503601075
key: test_mcc
value: [0.36346175 0.31918781 0.40636016 0.3818016 0.36760849 0.45074569
0.36803788 0.40465381 0.35434479 0.43421317]
mean value: 0.38504151518141727
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.39215686 0.34343434 0.44859813 0.42990654 0.47154472 0.51724138
0.43636364 0.46846847 0.42201835 0.52459016]
mean value: 0.4454322591224521
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.74074074 0.70833333 0.75 0.71875 0.60416667 0.73170732
0.66666667 0.7027027 0.65714286 0.66666667]
mean value: 0.6946876950992806
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.26666667 0.22666667 0.32 0.30666667 0.38666667 0.4
0.32432432 0.35135135 0.31081081 0.43243243]
mean value: 0.3325585585585586
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.80745342 0.79813665 0.81677019 0.81055901 0.79813665 0.82608696
0.80685358 0.81619938 0.80373832 0.81931464]
mean value: 0.8103248776145971
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.61916329 0.59916329 0.64380567 0.63511471 0.65487179 0.67773279
0.63787066 0.65340847 0.63111391 0.68382755]
mean value: 0.6436072145019514
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.24390244 0.20731707 0.28915663 0.27380952 0.30851064 0.34883721
0.27906977 0.30588235 0.26744186 0.35555556]
mean value: 0.28794830465145765
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.14
Accuracy on Blind test: 0.58
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.54735494 3.51562953 3.44478083 3.37995839 3.37361455 3.39749098
3.41077304 3.38306165 3.39238596 3.37152028]
mean value: 3.4216570138931273
key: score_time
value: [0.01157308 0.01044941 0.01057196 0.01047254 0.01043725 0.01081729
0.01058316 0.01019454 0.01055932 0.01113629]
mean value: 0.010679483413696289
key: test_mcc
value: [0.51954944 0.52945673 0.49644937 0.53264151 0.49488702 0.5529111
0.54479346 0.35641129 0.48467524 0.50213363]
mean value: 0.5013908809764891
key: train_mcc
value: [0.67486051 0.66693886 0.67370076 0.67144675 0.6748799 0.6714386
0.67188477 0.68644061 0.67638353 0.67981059]
mean value: 0.6747784866911827
key: test_fscore
value: [0.58333333 0.58823529 0.55932203 0.60162602 0.59854015 0.63636364
0.625 0.46774194 0.56198347 0.58730159]
mean value: 0.5809447453818325
key: train_fscore
value: [0.71799463 0.71095153 0.72035398 0.71808511 0.72295515 0.71758437
0.71758437 0.73175022 0.72261484 0.72790901]
mean value: 0.720778319944263
key: test_precision
value: [0.77777778 0.79545455 0.76744186 0.77083333 0.66129032 0.73684211
0.74074074 0.58 0.72340426 0.71153846]
mean value: 0.7265323402472927
key: train_precision
value: [0.89910314 0.89390519 0.88671024 0.88621444 0.88197425 0.88791209
0.88986784 0.89462366 0.88913043 0.88322718]
mean value: 0.8892668457717798
key: test_recall
value: [0.46666667 0.46666667 0.44 0.49333333 0.54666667 0.56
0.54054054 0.39189189 0.45945946 0.5 ]
mean value: 0.4865225225225225
key: train_recall
value: [0.5976155 0.59016393 0.60655738 0.60357675 0.61251863 0.60208644
0.60119048 0.61904762 0.60863095 0.61904762]
mean value: 0.6060435295578739
key: test_accuracy
value: [0.8447205 0.84782609 0.83850932 0.84782609 0.82919255 0.85093168
0.85046729 0.79439252 0.83489097 0.83800623]
mean value: 0.8376763220525918
key: train_accuracy
value: [0.89115411 0.88873531 0.89080857 0.89011748 0.89115411 0.89011748
0.89015544 0.89464594 0.89153713 0.8925734 ]
mean value: 0.8910998993808882
key: test_roc_auc
value: [0.71309042 0.71511471 0.69975709 0.72439946 0.73082321 0.74963563
0.74193019 0.65343582 0.70341394 0.71963563]
mean value: 0.7151236094393989
key: train_roc_auc
value: [0.78868629 0.78451067 0.79158278 0.79009247 0.79388864 0.78957223
0.78934917 0.79850267 0.79284449 0.79715314]
mean value: 0.7916182560070071
key: test_jcc
value: [0.41176471 0.41666667 0.38823529 0.43023256 0.42708333 0.46666667
0.45454545 0.30526316 0.3908046 0.41573034]
mean value: 0.4106992772026194
key: train_jcc
value: [0.56005587 0.55153203 0.56293223 0.56016598 0.5661157 0.55955679
0.55955679 0.57697642 0.56569848 0.57221458]
mean value: 0.5634804857836729
MCC on Blind test: 0.15
Accuracy on Blind test: 0.58
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02233529 0.02248764 0.02127099 0.0224936 0.02159405 0.02197123
0.02259278 0.02309656 0.02119684 0.02321959]
mean value: 0.02222585678100586
key: score_time
value: [0.01115465 0.01035666 0.01053786 0.01111293 0.01095581 0.01114631
0.01114392 0.01076579 0.01023817 0.01121664]
mean value: 0.01086287498474121
key: test_mcc
value: [0.23948505 0.25217898 0.2991717 0.21673455 0.31340684 0.28005604
0.16254629 0.27172817 0.24059051 0.27201 ]
mean value: 0.254790814776087
key: train_mcc
value: [0.26805394 0.25949345 0.26290486 0.25728432 0.26878368 0.26362785
0.26745478 0.273496 0.26106402 0.26703112]
mean value: 0.26491940211153575
key: test_fscore
value: [0.45 0.4494382 0.49019608 0.43809524 0.5 0.4784689
0.4 0.47169811 0.44198895 0.47222222]
mean value: 0.45921077040013447
key: train_fscore
value: [0.46843854 0.46330778 0.46534653 0.46237141 0.469163 0.46615721
0.46905537 0.47252747 0.46578249 0.46880087]
mean value: 0.4670950671996213
key: test_precision
value: [0.36 0.38834951 0.3875969 0.34074074 0.38686131 0.37313433
0.3129771 0.36231884 0.37383178 0.35915493]
mean value: 0.36449654418502264
key: train_precision
value: [0.37268722 0.36623377 0.36878814 0.36309524 0.3720524 0.36778639
0.36923077 0.37456446 0.36191261 0.36891546]
mean value: 0.3685266464160096
key: test_recall
value: [0.6 0.53333333 0.66666667 0.61333333 0.70666667 0.66666667
0.55405405 0.67567568 0.54054054 0.68918919]
mean value: 0.6246126126126126
key: train_recall
value: [0.63040238 0.63040238 0.63040238 0.63636364 0.63487332 0.63636364
0.64285714 0.63988095 0.65327381 0.64285714]
mean value: 0.6377676797246469
key: test_accuracy
value: [0.65838509 0.69565217 0.67701863 0.63354037 0.67080745 0.66149068
0.61682243 0.65109034 0.68535826 0.64485981]
mean value: 0.6595025251059383
key: train_accuracy
value: [0.6682792 0.66136835 0.66413269 0.6568763 0.66689703 0.66205943
0.66217617 0.66839378 0.65215889 0.66183074]
mean value: 0.6624172577890293
key: test_roc_auc
value: [0.63805668 0.6391363 0.6734143 0.62650472 0.68329285 0.66329285
0.59484079 0.65970019 0.63464274 0.66038407]
mean value: 0.6473265492212861
key: train_roc_auc
value: [0.65505724 0.65055882 0.65235819 0.64971578 0.65571826 0.6530896
0.65543667 0.658447 0.65254784 0.65521175]
mean value: 0.6538141142662821
key: test_jcc
value: [0.29032258 0.28985507 0.32467532 0.2804878 0.33333333 0.31446541
0.25 0.30864198 0.28368794 0.30909091]
mean value: 0.298456035246263
key: train_jcc
value: [0.30585683 0.30149679 0.30322581 0.30070423 0.30647482 0.30391459
0.30638298 0.30935252 0.30359613 0.30616584]
mean value: 0.3047170532040558
MCC on Blind test: 0.2
Accuracy on Blind test: 0.61
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.79316545 3.77993989 3.58109045 4.08095026 3.64671731 3.47795367
3.65127897 3.62924051 3.82486391 3.69410038]
mean value: 3.7159300804138184
key: score_time
value: [0.09907436 0.09605098 0.09689355 0.09686065 0.09744859 0.09700894
0.09645844 0.09619999 0.0967989 0.0965364 ]
mean value: 0.09693307876586914
key: test_mcc
value: [0.1631053 0.1631053 0.33490183 0.31480865 0.2002288 0.27509518
0.22927073 0.31648393 0.24850712 0.39376149]
mean value: 0.2639268345032747
key: train_mcc
value: [0.53167932 0.53763718 0.536737 0.53715981 0.5252782 0.5363085
0.53623235 0.54129628 0.54670153 0.52647795]
mean value: 0.5355508125746825
key: test_fscore
value: [0.16091954 0.16091954 0.27272727 0.31578947 0.24742268 0.30612245
0.2247191 0.28888889 0.24444444 0.41584158]
mean value: 0.2637794974878561
key: train_fscore
value: [0.51803279 0.51762115 0.51973684 0.52344602 0.50498339 0.52078775
0.51916758 0.53030303 0.53101197 0.50989011]
mean value: 0.5194980618002966
key: test_precision
value: [0.58333333 0.58333333 0.92307692 0.75 0.54545455 0.65217391
0.66666667 0.8125 0.6875 0.77777778]
mean value: 0.6981816492686058
key: train_precision
value: [0.97131148 0.99156118 0.98340249 0.97560976 0.98275862 0.97942387
0.98340249 0.97222222 0.98785425 0.97478992]
mean value: 0.9802336270398275
key: test_recall
value: [0.09333333 0.09333333 0.16 0.2 0.16 0.2
0.13513514 0.17567568 0.14864865 0.28378378]
mean value: 0.164990990990991
key: train_recall
value: [0.35320417 0.35022355 0.35320417 0.35767511 0.33979136 0.35469449
0.35267857 0.36458333 0.36309524 0.3452381 ]
mean value: 0.3534388084593003
key: test_accuracy
value: [0.77329193 0.77329193 0.80124224 0.79813665 0.77329193 0.78881988
0.78504673 0.80062305 0.78816199 0.81619938]
mean value: 0.7898105686809467
key: train_accuracy
value: [0.84761576 0.84865238 0.84865238 0.84899793 0.8455425 0.84865238
0.84835924 0.85008636 0.85112263 0.84594128]
mean value: 0.848362283707701
key: test_roc_auc
value: [0.53654521 0.53654521 0.57797571 0.58987854 0.55975709 0.58380567
0.55744611 0.58176496 0.56420287 0.62974614]
mean value: 0.5717667505562243
key: train_roc_auc
value: [0.67502764 0.67466193 0.6757024 0.67748803 0.66899599 0.67622264
0.6754396 0.68071722 0.68087286 0.67126952]
mean value: 0.6756397820973963
key: test_jcc
value: [0.0875 0.0875 0.15789474 0.1875 0.14117647 0.18072289
0.12658228 0.16883117 0.13924051 0.2625 ]
mean value: 0.1539448052637901
key: train_jcc
value: [0.34955752 0.34918276 0.35111111 0.35450517 0.33777778 0.35207101
0.35059172 0.36082474 0.36148148 0.34218289]
mean value: 0.3509286181122742
MCC on Blind test: 0.09
Accuracy on Blind test: 0.56
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02325869 0.01704717 0.01801252 0.01943851 0.01945305 0.01974058
0.01753187 0.01988554 0.01897764 0.01963639]
mean value: 0.019298195838928223
key: score_time
value: [0.04681897 0.02970815 0.02862167 0.02946496 0.02901721 0.02905989
0.02940273 0.02625632 0.02601814 0.02773595]
mean value: 0.030210399627685548
key: test_mcc
value: [0.20885187 0.23445029 0.20252962 0.22417521 0.14003452 0.2232046
0.17303136 0.27427585 0.24674878 0.25464306]
mean value: 0.2181945145344219
key: train_mcc
value: [0.45122639 0.4465095 0.44522895 0.45502748 0.43104542 0.44979451
0.46338368 0.43746094 0.4496036 0.43002262]
mean value: 0.4459303066847749
key: test_fscore
value: [0.30909091 0.31775701 0.28571429 0.32432432 0.26548673 0.35772358
0.28828829 0.36363636 0.32380952 0.39370079]
mean value: 0.32295317945105534
key: train_fscore
value: [0.51361868 0.51067961 0.50588235 0.52651515 0.49007937 0.50787402
0.52376334 0.50387597 0.51262136 0.49806202]
mean value: 0.5092971854262877
key: test_precision
value: [0.48571429 0.53125 0.5 0.5 0.39473684 0.45833333
0.43243243 0.55555556 0.5483871 0.47169811]
mean value: 0.48781076591226114
key: train_precision
value: [0.7394958 0.73259053 0.73925501 0.72207792 0.73293769 0.74782609
0.75208914 0.72222222 0.73743017 0.71388889]
mean value: 0.7339813451587397
key: test_recall
value: [0.22666667 0.22666667 0.2 0.24 0.2 0.29333333
0.21621622 0.27027027 0.22972973 0.33783784]
mean value: 0.24407207207207207
key: train_recall
value: [0.39344262 0.39195231 0.38450075 0.414307 0.3681073 0.38450075
0.40178571 0.38690476 0.39285714 0.38244048]
mean value: 0.39007988254914483
key: test_accuracy
value: [0.76397516 0.77329193 0.76708075 0.76708075 0.74223602 0.75465839
0.75389408 0.78193146 0.7788162 0.76012461]
mean value: 0.7643089336506645
key: train_accuracy
value: [0.82722875 0.82584658 0.82584658 0.82722875 0.82239115 0.82722875
0.83039724 0.82314335 0.82659758 0.82107081]
mean value: 0.8256979540780579
key: test_roc_auc
value: [0.57689609 0.58296896 0.56963563 0.58356275 0.5534413 0.59403509
0.56559799 0.60274647 0.58652478 0.61223876]
mean value: 0.5827647809753074
key: train_roc_auc
value: [0.67580363 0.67438371 0.67178254 0.68308693 0.66381074 0.67268222
0.68087486 0.67096025 0.67528597 0.66805335]
mean value: 0.6736724199070511
key: test_jcc
value: [0.1827957 0.18888889 0.16666667 0.19354839 0.15306122 0.21782178
0.16842105 0.22222222 0.19318182 0.24509804]
mean value: 0.19317057804963803
key: train_jcc
value: [0.34554974 0.34289439 0.33858268 0.35732648 0.32457293 0.34036939
0.35479632 0.33678756 0.34464752 0.3316129 ]
mean value: 0.34171399189764795
MCC on Blind test: 0.1
Accuracy on Blind test: 0.57
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.11944056 0.11581206 0.11613488 0.11338544 0.11381531 0.11223698
0.11461377 0.12243223 0.11454034 0.11531353]
mean value: 0.11577250957489013
key: score_time
value: [0.02596951 0.01333499 0.01338267 0.01316214 0.01321292 0.01334953
0.01331091 0.0133028 0.01325774 0.01319242]
mean value: 0.01454756259918213
key: test_mcc
value: [0.47485004 0.39240954 0.38845268 0.4469218 0.43493982 0.36445258
0.49649979 0.37417573 0.43621849 0.42545917]
mean value: 0.4234379638909915
key: train_mcc
value: [0.47740314 0.49341793 0.49212982 0.49535513 0.49666655 0.48904343
0.48744022 0.49864682 0.49059734 0.49650302]
mean value: 0.4917203396529928
key: test_fscore
value: [0.54237288 0.48333333 0.47457627 0.52892562 0.55944056 0.49635036
0.59701493 0.48818898 0.51282051 0.54411765]
mean value: 0.5227141091744903
key: train_fscore
value: [0.56510186 0.57651246 0.57524488 0.57754011 0.58223395 0.57644991
0.57369815 0.58421053 0.57422222 0.58413252]
mean value: 0.5769346578090504
key: test_precision
value: [0.74418605 0.64444444 0.65116279 0.69565217 0.58823529 0.5483871
0.66666667 0.58490566 0.69767442 0.59677419]
mean value: 0.6418088785655695
key: train_precision
value: [0.69650655 0.71523179 0.71460177 0.71840355 0.71030043 0.70235546
0.70498915 0.71153846 0.71302428 0.70526316]
mean value: 0.7092214601458063
key: test_recall
value: [0.42666667 0.38666667 0.37333333 0.42666667 0.53333333 0.45333333
0.54054054 0.41891892 0.40540541 0.5 ]
mean value: 0.4464864864864865
key: train_recall
value: [0.47540984 0.4828614 0.48137109 0.4828614 0.49329359 0.48882265
0.48363095 0.49553571 0.48065476 0.4985119 ]
mean value: 0.48629533035270744
key: test_accuracy
value: [0.83229814 0.80745342 0.80745342 0.82298137 0.80434783 0.78571429
0.8317757 0.79750779 0.82242991 0.80685358]
mean value: 0.8118815425398116
key: train_accuracy
value: [0.83033863 0.83552177 0.83517623 0.83621285 0.83586731 0.83344851
0.83316062 0.83626943 0.83454231 0.83523316]
mean value: 0.8345770834303119
key: test_roc_auc
value: [0.69106613 0.66094467 0.65630229 0.68499325 0.7099865 0.6699865
0.72978444 0.66492505 0.67638691 0.69939271]
mean value: 0.6843768464821097
key: train_roc_auc
value: [0.70644086 0.71241586 0.7116707 0.7128657 0.71628242 0.71314727
0.71122618 0.71740348 0.71108761 0.71776697]
mean value: 0.7130307061120261
key: test_jcc
value: [0.37209302 0.31868132 0.31111111 0.35955056 0.38834951 0.33009709
0.42553191 0.32291667 0.34482759 0.37373737]
mean value: 0.35468961582922987
key: train_jcc
value: [0.39382716 0.405 0.40375 0.40601504 0.41066998 0.40493827
0.40222772 0.41263941 0.40274314 0.41256158]
mean value: 0.4054372291354911
MCC on Blind test: 0.23
Accuracy on Blind test: 0.61
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.07580376 0.06974602 0.11002922 0.11698771 0.08370996 0.07091403
0.06857276 0.07030511 0.06945276 0.07148576]
mean value: 0.08070070743560791
key: score_time
value: [0.01987982 0.01836348 0.02074289 0.02028203 0.01625824 0.01768279
0.01709962 0.01752877 0.01774192 0.01661849]
mean value: 0.018219804763793944
key: test_mcc
value: [0.44849397 0.37408603 0.40636016 0.44389334 0.41816554 0.38683591
0.43421317 0.38403926 0.41491889 0.40374924]
mean value: 0.411475551488504
key: train_mcc
value: [0.46589647 0.46664047 0.46305167 0.45136572 0.47210351 0.4672281
0.46352623 0.47305201 0.45653837 0.47390592]
mean value: 0.46533084767284105
key: test_fscore
value: [0.50877193 0.43636364 0.44859813 0.5210084 0.53030303 0.50381679
0.52459016 0.47457627 0.47272727 0.51908397]
mean value: 0.4939839601900612
key: train_fscore
value: [0.54054054 0.54018692 0.54193548 0.52847806 0.54847645 0.54189944
0.54193548 0.55404178 0.53432282 0.5498155 ]
mean value: 0.5421632476086466
key: test_precision
value: [0.74358974 0.68571429 0.75 0.70454545 0.61403509 0.58928571
0.66666667 0.63636364 0.72222222 0.59649123]
mean value: 0.6708914039177196
key: train_precision
value: [0.72139303 0.72431078 0.71014493 0.7075 0.72087379 0.72208437
0.71186441 0.71095571 0.70935961 0.72330097]
mean value: 0.7161787587478372
key: test_recall
value: [0.38666667 0.32 0.32 0.41333333 0.46666667 0.44
0.43243243 0.37837838 0.35135135 0.45945946]
mean value: 0.39682882882882886
key: train_recall
value: [0.43219076 0.43070045 0.43815201 0.42175857 0.44262295 0.43368107
0.4375 0.45386905 0.42857143 0.44345238]
mean value: 0.4362498669363424
key: test_accuracy
value: [0.82608696 0.80745342 0.81677019 0.82298137 0.80745342 0.79813665
0.81931464 0.80685358 0.81931464 0.80373832]
mean value: 0.8128103171378264
key: train_accuracy
value: [0.82964755 0.82999309 0.82826538 0.82550104 0.83102972 0.82999309
0.8283247 0.83039724 0.82659758 0.83143351]
mean value: 0.8291182877324653
key: test_roc_auc
value: [0.67309042 0.63773279 0.64380567 0.68035088 0.68879892 0.6734413
0.68382755 0.65680053 0.65543276 0.68317103]
mean value: 0.6676451836451837
key: train_roc_auc
value: [0.6909042 0.69060888 0.69208545 0.6845635 0.69544553 0.69164935
0.69198437 0.69904429 0.687745 0.69608516]
mean value: 0.6920115731442845
key: test_jcc
value: [0.34117647 0.27906977 0.28915663 0.35227273 0.36082474 0.33673469
0.35555556 0.31111111 0.30952381 0.35051546]
mean value: 0.32859409680624413
key: train_jcc
value: [0.37037037 0.37003841 0.37168142 0.35913706 0.3778626 0.37164751
0.37168142 0.38316583 0.36455696 0.37913486]
mean value: 0.37192764265786016
MCC on Blind test: 0.19
Accuracy on Blind test: 0.6
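Note (editorial): the ConvergenceWarning and UndefinedMetricWarning emitted during the Logistic Regression folds above point to two straightforward remedies, shown in the hedged sketch below. The parameter values are illustrative examples, not the values used in this run.
# Sketch only: possible ways to address the two warnings seen above.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score

# ConvergenceWarning: give lbfgs more iterations (the 'prep' step already MinMax-scales
# the numeric features, so raising max_iter is the remaining lever).
clf = LogisticRegression(random_state=42, max_iter=5000)   # 5000 is an arbitrary example value

# UndefinedMetricWarning: make the zero-positive-prediction case explicit.
y_true = [1, 0, 0, 1]
y_pred = [0, 0, 0, 0]                                      # a fold with no predicted positives
print(precision_score(y_true, y_pred, zero_division=0))    # 0.0, reported without the warning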
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [0.95303845 0.8442843 0.85847378 1.06652308 0.86740303 1.04451871
0.83575869 0.85381722 0.98099589 0.85632706]
mean value: 0.9161140203475953
key: score_time
value: [0.01372981 0.01369858 0.0136714 0.01386285 0.01369905 0.01369166
0.01401448 0.01373887 0.01373386 0.01366496]
mean value: 0.013750553131103516
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.76708075 0.76708075 0.76708075 0.76708075 0.76708075 0.76708075
0.7694704 0.7694704 0.7694704 0.7694704 ]
mean value: 0.7680366091987384
key: train_accuracy
value: [0.76814098 0.76814098 0.76814098 0.76814098 0.76814098 0.76814098
0.76787565 0.76787565 0.76787565 0.76787565]
mean value: 0.7680348478717803
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.55
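Note (editorial): for this classifier, MCC, F-score, precision, recall and JCC of 0 on every fold, ROC AUC of 0.5 and accuracy pinned near 0.77 (the majority-class fraction) are consistent with a model that predicts only the majority class. The check below is an illustrative sketch on synthetic data with the same class balance, not part of the original script.
# Sketch: an always-majority classifier reproduces the metric pattern above.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef, roc_auc_score

y = np.array([0] * 77 + [1] * 23)                        # ~77% majority class, as in this dataset
X = np.zeros((100, 1))
dummy = DummyClassifier(strategy='most_frequent').fit(X, y)
y_pred = dummy.predict(X)

print(matthews_corrcoef(y, y_pred))                      # 0.0
print(f1_score(y, y_pred, zero_division=0))              # 0.0
print(accuracy_score(y, y_pred))                         # 0.77
print(roc_auc_score(y, dummy.predict_proba(X)[:, 1]))    # 0.5 (constant scores)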
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [ 3.8930521 10.4773488 6.92323995 6.88646317 10.01851606 5.48201418
9.31204343 6.42008114 5.14960241 8.51365113]
mean value: 7.3076012372970585
key: score_time
value: [0.01381254 0.01391196 0.01391768 0.0140512 0.01459169 0.01404238
0.01436186 0.01391315 0.01411676 0.01427174]
mean value: 0.01409909725189209
key: test_mcc
value: [0.45762004 0.35841753 0.38462263 0.39264701 0.34522396 0.35973947
0.38660377 0.36671003 0.49643747 0.30976458]
mean value: 0.3857786489811861
key: train_mcc
value: [0.55364567 0.66161727 0.64108979 0.66023306 0.71487014 0.56616796
0.67631506 0.57512984 0.57348968 0.63205833]
mean value: 0.6254616800520258
key: test_fscore
value: [0.56934307 0.43859649 0.46551724 0.52777778 0.48571429 0.46774194
0.5 0.45217391 0.57377049 0.42622951]
mean value: 0.49068647103202245
key: train_fscore
value: [0.64864865 0.71340929 0.69965278 0.74030724 0.7617421 0.59683794
0.7314578 0.62347418 0.64644714 0.67284523]
mean value: 0.6834822350450145
key: test_precision
value: [0.62903226 0.64102564 0.65853659 0.55072464 0.52307692 0.59183673
0.59259259 0.63414634 0.72916667 0.54166667]
mean value: 0.6091805047297312
key: train_precision
value: [0.69505963 0.86595745 0.83783784 0.72701149 0.892 0.8856305
0.85628743 0.84478372 0.77385892 0.89189189]
mean value: 0.8270318855862036
key: test_recall
value: [0.52 0.33333333 0.36 0.50666667 0.45333333 0.38666667
0.43243243 0.35135135 0.47297297 0.35135135]
mean value: 0.41681081081081084
key: train_recall
value: [0.60804769 0.60655738 0.60059613 0.75409836 0.66467958 0.45007452
0.63839286 0.49404762 0.55505952 0.54017857]
mean value: 0.5911732222695336
key: test_accuracy
value: [0.81677019 0.80124224 0.80745342 0.78881988 0.77639752 0.79503106
0.80062305 0.80373832 0.83800623 0.78193146]
mean value: 0.8010013351134846
key: train_accuracy
value: [0.84727021 0.8870076 0.88044229 0.87733241 0.90359364 0.85901866
0.89119171 0.86148532 0.85906736 0.87806563]
mean value: 0.8744474841044481
key: test_roc_auc
value: [0.7134413 0.63832659 0.65165992 0.6905803 0.66391363 0.6528475
0.6716818 0.6453113 0.7101707 0.63114126]
mean value: 0.6669074296969033
key: train_roc_auc
value: [0.76376294 0.78910865 0.7827542 0.83431414 0.82019404 0.71626533
0.8030021 0.73330361 0.75301334 0.76019275]
mean value: 0.7755911095603178
key: test_jcc
value: [0.39795918 0.28089888 0.30337079 0.35849057 0.32075472 0.30526316
0.33333333 0.29213483 0.40229885 0.27083333]
mean value: 0.3265337636210476
key: train_jcc
value: [0.48 0.55449591 0.53805073 0.58768873 0.61517241 0.42535211
0.5766129 0.45293315 0.47759283 0.50698324]
mean value: 0.5214882032205558
MCC on Blind test: 0.24
Accuracy on Blind test: 0.62
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02436471 0.02455974 0.02453542 0.02460241 0.02486968 0.02455211
0.0250361 0.02492261 0.02510357 0.02481937]
mean value: 0.024736571311950683
key: score_time
value: [0.01328278 0.01326609 0.01337123 0.0132761 0.01329017 0.01343536
0.01343918 0.01339316 0.01334906 0.013273 ]
mean value: 0.01333761215209961
key: test_mcc
value: [0.2042361 0.20642253 0.24105806 0.1566726 0.27863285 0.20043185
0.13829095 0.26881917 0.17495462 0.17666914]
mean value: 0.204618787178046
key: train_mcc
value: [0.22173376 0.21522457 0.20263237 0.22175404 0.19996772 0.21195331
0.21925894 0.21579729 0.22790709 0.21908528]
mean value: 0.21553143868602848
key: test_fscore
value: [0.37410072 0.36363636 0.4028777 0.36809816 0.43661972 0.38666667
0.34210526 0.41791045 0.33070866 0.36111111]
mean value: 0.37838348088358015
key: train_fscore
value: [0.39116719 0.38164251 0.37984496 0.40179238 0.37519623 0.38118022
0.40089087 0.38301282 0.4045283 0.39940165]
mean value: 0.38986571402175485
key: test_precision
value: [0.40625 0.42105263 0.4375 0.34090909 0.46268657 0.38666667
0.33333333 0.46666667 0.39622642 0.37142857]
mean value: 0.40227199428417953
key: train_precision
value: [0.41541039 0.4150613 0.39579968 0.40269461 0.39635158 0.40994854
0.4 0.41493056 0.41041348 0.40150376]
mean value: 0.4062113877605794
key: test_recall
value: [0.34666667 0.32 0.37333333 0.4 0.41333333 0.38666667
0.35135135 0.37837838 0.28378378 0.35135135]
mean value: 0.3604864864864865
key: train_recall
value: [0.36959762 0.35320417 0.36512668 0.40089419 0.3561848 0.3561848
0.40178571 0.35565476 0.39880952 0.39732143]
mean value: 0.37547636789440064
key: test_accuracy
value: [0.72981366 0.73913043 0.74223602 0.68012422 0.7515528 0.71428571
0.68847352 0.75700935 0.73520249 0.71339564]
mean value: 0.7251223854027591
key: train_accuracy
value: [0.73324119 0.73462336 0.723566 0.72322046 0.72494817 0.73185902
0.72124352 0.73402418 0.72746114 0.72262522]
mean value: 0.7276812248079225
key: test_roc_auc
value: [0.59641026 0.59319838 0.61379217 0.58259109 0.63379217 0.60021592
0.57041252 0.62441186 0.57711456 0.58660685]
mean value: 0.5978545792756319
key: train_roc_auc
value: [0.60630128 0.60147838 0.59844278 0.6107035 0.59622105 0.60071948
0.60979974 0.60202891 0.61281007 0.60914205]
mean value: 0.6047647246579515
key: test_jcc
value: [0.2300885 0.22222222 0.25225225 0.22556391 0.27927928 0.23966942
0.20634921 0.26415094 0.19811321 0.22033898]
mean value: 0.2338027920934464
key: train_jcc
value: [0.24313725 0.2358209 0.23444976 0.25140187 0.23091787 0.23546798
0.25069638 0.23686819 0.25354778 0.24953271]
mean value: 0.24218406872006137
MCC on Blind test: 0.12
Accuracy on Blind test: 0.58
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02731204 0.02934813 0.02717161 0.02741313 0.02751517 0.02728319
0.02772713 0.02758503 0.02789497 0.02753091]
mean value: 0.02767813205718994
key: score_time
value: [0.01370072 0.01387596 0.01367784 0.01366973 0.01379371 0.0138731
0.0138166 0.01367807 0.01377082 0.01381063]
mean value: 0.013766717910766602
key: test_mcc
value: [0.15085997 0.18533318 0.12659526 0.1263011 0.26526964 0.28362782
0.149913 0.23472095 0.03983718 0.24391921]
mean value: 0.18063773109847905
key: train_mcc
value: [0.21653983 0.21605277 0.21058328 0.21826885 0.21560555 0.20931897
0.2143105 0.21764474 0.23040719 0.20083551]
mean value: 0.21495671918217427
key: test_fscore
value: [0.28813559 0.28037383 0.26086957 0.29457364 0.38333333 0.41860465
0.29752066 0.37795276 0.19298246 0.4 ]
mean value: 0.31943464913232955
key: train_fscore
value: [0.34450652 0.34862385 0.35040431 0.34870849 0.34054563 0.33707865
0.34940855 0.34301781 0.36047575 0.34912281]
mean value: 0.347189236991495
key: test_precision
value: [0.39534884 0.46875 0.375 0.35185185 0.51111111 0.5
0.38297872 0.45283019 0.275 0.44262295]
mean value: 0.4155493663075438
key: train_precision
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
value: [0.45905707 0.45346062 0.44117647 0.45762712 0.46173469 0.4534005
 0.44964871 0.46329114 0.46793349 0.42521368]
mean value: 0.45325434974579853
key: test_recall
value: [0.22666667 0.2 0.2 0.25333333 0.30666667 0.36
0.24324324 0.32432432 0.14864865 0.36486486]
mean value: 0.26277477477477473
key: train_recall
value: [0.2757079 0.28315946 0.29061103 0.28166915 0.26974665 0.26825633
0.28571429 0.27232143 0.29315476 0.29613095]
mean value: 0.28164719501809665
key: test_accuracy
value: [0.73913043 0.76086957 0.73602484 0.7173913 0.77018634 0.76708075
0.73520249 0.75389408 0.71339564 0.74766355]
mean value: 0.7440838993053539
key: train_accuracy
value: [0.75673808 0.75466482 0.75017277 0.75604699 0.75777471 0.75535591
0.75302245 0.75785838 0.75854922 0.74369603]
mean value: 0.7543879362101089
key: test_roc_auc
value: [0.56070175 0.56558704 0.54939271 0.55581646 0.60879892 0.62534413
0.56291717 0.60345771 0.51561987 0.61360652]
mean value: 0.5761242294926505
key: train_roc_auc
value: [0.58882111 0.59007276 0.58974996 0.59045221 0.58741493 0.58532025
0.59000064 0.5884774 0.59619501 0.58756165]
mean value: 0.5894065934604653
key: test_jcc
value: [0.16831683 0.16304348 0.15 0.17272727 0.2371134 0.26470588
0.17475728 0.23300971 0.10679612 0.25 ]
mean value: 0.1920469973882224
key: train_jcc
value: [0.20809899 0.21111111 0.2124183 0.21117318 0.20521542 0.2027027
0.21168688 0.20701357 0.21986607 0.21147715]
mean value: 0.21007633838314238
MCC on Blind test: 0.13
Accuracy on Blind test: 0.58
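The UndefinedMetricWarning captured in the train_precision block above is raised whenever a fold produces no positive predictions. A minimal sketch of the fix the warning itself suggests (an illustration, not the project's code) is to forward zero_division through make_scorer so those folds score 0.0 silently:

from sklearn.metrics import make_scorer, precision_score

# zero_division=0 is passed on to precision_score for every fold, so folds
# with no predicted positives return 0.0 without emitting the warning.
precision_scorer = make_scorer(precision_score, zero_division=0)
# e.g. scoring={'precision': precision_scorer, ...} inside cross_validate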
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.05420589 0.04669428 0.04829931 0.04162359 0.05197477 0.05355644
0.06082964 0.0556972 0.05445743 0.04880691]
mean value: 0.05161454677581787
key: score_time
value: [0.01202941 0.01342869 0.01359892 0.01345253 0.01336432 0.01387405
0.01338744 0.01328826 0.01346827 0.01345849]
mean value: 0.013335037231445312
key: test_mcc
value: [0.17313009 0. 0.19234543 0.0995128 0.07086565 0.34581909
0.24527992 0.30707773 0.4383594 0.27018706]
mean value: 0.21425771826701343
key: train_mcc
value: [0.17751895 0.04786569 0.26817597 0.12226679 0.18527118 0.416217
0.27795431 0.36563438 0.43253926 0.33494458]
mean value: 0.262838812539414
key: test_fscore
value: [0.41242938 0. 0.43055556 0.05128205 0.05063291 0.48175182
0.22727273 0.36538462 0.49090909 0.27956989]
mean value: 0.2789788047618155
key: train_fscore
value: [0.41248817 0.00594354 0.45999256 0.0380117 0.0960452 0.53278008
0.23243934 0.39120879 0.50572519 0.33681765]
mean value: 0.3011452215697504
key: test_precision
value: [0.26164875 0. 0.29107981 0.66666667 0.5 0.53225806
0.71428571 0.63333333 0.75 0.68421053]
mean value: 0.5033482862843919
key: train_precision
value: [0.2616 1. 0.30654762 1. 0.91891892 0.6011236
0.81981982 0.74789916 0.70478723 0.76719577]
mean value: 0.7127892114194162
key: test_recall
value: [0.97333333 0. 0.82666667 0.02666667 0.02666667 0.44
0.13513514 0.25675676 0.36486486 0.17567568]
mean value: 0.32257657657657657
key: train_recall
value: [0.97466468 0.00298063 0.92101341 0.01937407 0.05067064 0.47839046
0.13541667 0.26488095 0.39434524 0.21577381]
mean value: 0.34575105563835073
key: test_accuracy
value: [0.35403727 0.76708075 0.49068323 0.77018634 0.76708075 0.77950311
0.78816199 0.79439252 0.82554517 0.79127726]
mean value: 0.7127948375611928
key: train_accuracy
value: [0.35625432 0.76883207 0.49861783 0.77263303 0.7788528 0.80545957
0.79240069 0.80863558 0.82107081 0.80276339]
mean value: 0.7205520086224493
key: test_roc_auc
value: [0.56966262 0.5 0.60766532 0.51130904 0.50928475 0.66129555
0.5594704 0.60611117 0.66421381 0.57569209]
mean value: 0.5764704745231061
key: train_roc_auc
value: [0.57212766 0.50149031 0.64606676 0.50968703 0.52466056 0.691287
0.56320991 0.6189452 0.67220636 0.59799037]
mean value: 0.5897671157633948
key: test_jcc
value: [0.25978648 0. 0.27433628 0.02631579 0.02597403 0.31730769
0.12820513 0.22352941 0.3253012 0.1625 ]
mean value: 0.17432560125986818
key: train_jcc
value: [0.25983313 0.00298063 0.29869502 0.01937407 0.0504451 0.36312217
0.13150289 0.2431694 0.33844189 0.20251397]
mean value: 0.1910078272449885
MCC on Blind test: 0.09
Accuracy on Blind test: 0.56
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.06411815 0.06341171 0.06504941 0.06490445 0.06702185 0.06490684
0.06440043 0.06435442 0.06607866 0.06633019]
mean value: 0.06505761146545411
key: score_time
value: [0.0163157 0.01486492 0.01475978 0.0146873 0.01471829 0.01469731
0.01655936 0.01486897 0.01501822 0.01515818]
mean value: 0.015164804458618165
key: test_mcc
value: [ 0.11302502 0.12180317 0.006158 0.06318476 0.12600316 0.03813986
0.04005786 0.11688592 -0.02798041 0.03740577]
mean value: 0.06346831107285812
key: train_mcc
value: [0.10972268 0.10869065 0.11426913 0.11226759 0.1081715 0.10972268
0.11383575 0.10926942 0.11433364 0.11383575]
mean value: 0.11141187872083935
key: test_fscore
value: [0.390625 0.39267016 0.37467018 0.38341969 0.39370079 0.37994723
0.37726098 0.38845144 0.36745407 0.37696335]
mean value: 0.38251628923453385
key: train_fscore
value: [0.38853503 0.38831019 0.38955007 0.38909829 0.38819786 0.38853503
0.38979118 0.3887764 0.38990426 0.38979118]
mean value: 0.38904894971093584
key: test_precision
value: [0.24271845 0.24429967 0.23355263 0.23794212 0.24509804 0.23684211
0.23322684 0.24104235 0.22801303 0.23376623]
mean value: 0.23765014645330998
key: train_precision
value: [0.24110672 0.24093357 0.24188897 0.24154068 0.24084709 0.24110672
0.24207493 0.24129264 0.24216216 0.24207493]
mean value: 0.24150284070038971
key: test_recall
value: [1. 1. 0.94666667 0.98666667 1. 0.96
0.98648649 1. 0.94594595 0.97297297]
mean value: 0.9798738738738739
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
  warn(
value: [0.27329193 0.27950311 0.26397516 0.26086957 0.2826087 0.27018634
 0.24922118 0.2741433 0.24922118 0.25856698]
mean value: 0.2661587430583774
key: train_accuracy
value: [0.27021424 0.26952315 0.27332412 0.27194195 0.26917761 0.27021424
0.27322971 0.2701209 0.27357513 0.27322971]
mean value: 0.27145507410364844
key: test_roc_auc
value: [0.52631579 0.53036437 0.50167341 0.51357625 0.53238866 0.51036437
0.50741328 0.52834008 0.49321589 0.50875369]
mean value: 0.5152405806616333
key: train_roc_auc
value: [0.52496626 0.52451642 0.52699055 0.52609087 0.5242915 0.52496626
0.52676563 0.52474134 0.52699055 0.52676563]
mean value: 0.5257085020242915
key: test_jcc
value: [0.24271845 0.24429967 0.23051948 0.23717949 0.24509804 0.23452769
0.23248408 0.24104235 0.22508039 0.23225806]
mean value: 0.2365207687158327
key: train_jcc
value: [0.24110672 0.24093357 0.24188897 0.24154068 0.24084709 0.24110672
0.24207493 0.24129264 0.24216216 0.24207493]
mean value: 0.24150284070038971
MCC on Blind test: 0.14
Accuracy on Blind test: 0.49
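The "Variables are collinear" UserWarning captured above comes from discriminant_analysis.py, i.e. the per-class covariance estimate of QDA is rank-deficient for this scaled and one-hot-encoded feature block. One common mitigation, offered here only as a hedged sketch rather than the project's configuration, is to add shrinkage via QDA's reg_param (or to prune strongly correlated columns before fitting):

from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# reg_param > 0 shrinks each per-class covariance towards a scaled identity,
# which typically removes the collinearity warning (the default is 0.0).
qda = QuadraticDiscriminantAnalysis(reg_param=0.1)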
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [9.38585162 9.20673943 9.24962592 9.16896152 9.24234509 9.27596521
9.31276107 9.54093075 9.27840185 9.20527101]
mean value: 9.286685347557068
key: score_time
value: [0.14280963 0.14655662 0.14437962 0.14620924 0.14272118 0.13551211
0.13642406 0.14215612 0.18832541 0.13576365]
mean value: 0.1460857629776001
key: test_mcc
value: [0.54249706 0.44286264 0.457329 0.48064438 0.40886675 0.45348537
0.52046504 0.47565971 0.37488434 0.47569604]
mean value: 0.46323903362752905
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.54716981 0.47169811 0.5045045 0.52252252 0.50406504 0.544
0.55855856 0.53097345 0.42990654 0.55737705]
mean value: 0.517077559332813
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.93548387 0.80645161 0.77777778 0.80555556 0.64583333 0.68
0.83783784 0.76923077 0.6969697 0.70833333]
mean value: 0.7663473787909272
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.38666667 0.33333333 0.37333333 0.38666667 0.41333333 0.45333333
0.41891892 0.40540541 0.31081081 0.45945946]
mean value: 0.3941261261261262
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.85093168 0.82608696 0.82919255 0.83540373 0.81055901 0.82298137
0.84735202 0.83489097 0.80996885 0.8317757 ]
mean value: 0.8299142818443915
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.68928475 0.65452092 0.67047233 0.67916329 0.67225371 0.694278
0.69731371 0.68448408 0.63516249 0.70138965]
mean value: 0.6778322938322938
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.37662338 0.30864198 0.3373494 0.35365854 0.33695652 0.37362637
0.3875 0.36144578 0.27380952 0.38636364]
mean value: 0.349597512477894
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.12
Accuracy on Blind test: 0.57
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [2.00493169 2.02980089 1.99658084 2.04834533 1.85356545 2.01188207
2.05266428 2.10134959 2.03474379 2.05019808]
mean value: 2.0184062004089354
key: score_time
value: [0.34803176 0.36303425 0.33005548 0.31672502 0.35083318 0.37642264
0.37759686 0.36037397 0.3167913 0.36850858]
mean value: 0.3508373022079468
key: test_mcc
value: [0.51959991 0.41693927 0.44298701 0.46850555 0.42870467 0.49482773
0.48630378 0.42359427 0.4087245 0.49389012]
mean value: 0.4584076801393969
key: train_mcc
value: [0.73473976 0.73029199 0.7449306 0.73877781 0.74655316 0.73115922
0.74545522 0.75213411 0.73912865 0.74737705]
mean value: 0.7410547577443618
key: test_fscore
value: [0.51923077 0.44230769 0.45098039 0.50909091 0.5 0.55172414
0.53571429 0.46728972 0.44230769 0.56666667]
mean value: 0.498531226503208
key: train_fscore
value: [0.75824176 0.75412844 0.77034358 0.76320583 0.7702089 0.75571821
0.77034358 0.77617329 0.76363636 0.77297297]
mean value: 0.7654972917905997
key: test_precision
value: [0.93103448 0.79310345 0.85185185 0.8 0.70731707 0.7804878
0.78947368 0.75757576 0.76666667 0.73913043]
mean value: 0.7916641204170674
key: train_precision
value: [0.98337292 0.98090692 0.97931034 0.98126464 0.98604651 0.97867299
0.98156682 0.98623853 0.98130841 0.97945205]
mean value: 0.9818140140492142
key: test_recall
value: [0.36 0.30666667 0.30666667 0.37333333 0.38666667 0.42666667
0.40540541 0.33783784 0.31081081 0.45945946]
mean value: 0.3673513513513514
key: train_recall
value: [0.61698957 0.61251863 0.63487332 0.62444113 0.6318927 0.61549925
0.63392857 0.63988095 0.625 0.63839286]
mean value: 0.6273416986019444
key: test_accuracy
value: [0.8447205 0.81987578 0.82608696 0.83229814 0.81987578 0.83850932
0.83800623 0.82242991 0.81931464 0.83800623]
mean value: 0.8299123468973123
key: train_accuracy
value: [0.90877678 0.90739461 0.9122322 0.91015895 0.91257775 0.90774015
0.91226252 0.91433506 0.91018998 0.91295337]
mean value: 0.9108621374936889
key: test_roc_auc
value: [0.67595142 0.64118758 0.64523617 0.67249663 0.66904184 0.69511471
0.68650837 0.65272459 0.64123536 0.70543823]
mean value: 0.6684934894408578
key: train_roc_auc
value: [0.80692033 0.80445994 0.81541237 0.8104212 0.81459682 0.80572534
0.81516492 0.81859095 0.81070063 0.81717214]
mean value: 0.8119164633360599
key: test_jcc
value: [0.35064935 0.28395062 0.29113924 0.34146341 0.33333333 0.38095238
0.36585366 0.30487805 0.28395062 0.39534884]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
  warn(
mean value: 0.33315194991698166
key: train_jcc
value: [0.61061947 0.60530191 0.62647059 0.61708395 0.62629247 0.60735294
0.62647059 0.63421829 0.61764706 0.62995595]
mean value: 0.6201413210045505
MCC on Blind test: 0.1
Accuracy on Blind test: 0.56
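The FutureWarning captured above is triggered by max_features='auto' in the Random Forest2 definition. Per the warning text, the behaviour-preserving replacement is 'sqrt' (or simply omitting the argument); the sketch below shows the same estimator with that change and is an illustration, not the project's updated code:

from sklearn.ensemble import RandomForestClassifier

# Same settings as the Random Forest2 entry above, with the deprecated
# max_features='auto' swapped for its documented equivalent 'sqrt'.
rf2 = RandomForestClassifier(max_features='sqrt', min_samples_leaf=5,
                             n_estimators=1000, n_jobs=10, oob_score=True,
                             random_state=42)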
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.04376912 0.03021073 0.05061078 0.03073382 0.03069043 0.0511291
0.04524422 0.04531503 0.04949951 0.03963804]
mean value: 0.041684079170227054
key: score_time
value: [0.01299667 0.0131259 0.01422453 0.01323485 0.01313543 0.02087951
0.02107191 0.02088737 0.02101541 0.02090383]
mean value: 0.017147541046142578
key: test_mcc
value: [0.44653117 0.36054864 0.44490067 0.44653117 0.45597949 0.38823762
0.43800019 0.40465381 0.41041911 0.38935548]
mean value: 0.41851573611728093
key: train_mcc
value: [0.45780627 0.46502914 0.45471615 0.45414224 0.46862204 0.46460044
0.4611309 0.45853121 0.46608388 0.46319007]
mean value: 0.4613852341776793
key: test_fscore
value: [0.5 0.42201835 0.49090909 0.5 0.53333333 0.48780488
0.53225806 0.46846847 0.45283019 0.496 ]
mean value: 0.48836223725789
key: train_fscore
value: [0.51988361 0.52478134 0.51930502 0.51322233 0.53128008 0.52292683
0.52274927 0.52399232 0.52662149 0.52692308]
mean value: 0.5231685370016435
key: test_precision
value: [0.75675676 0.67647059 0.77142857 0.75675676 0.71111111 0.625
0.66 0.7027027 0.75 0.60784314]
mean value: 0.7018069624246095
key: train_precision
value: [0.74444444 0.75418994 0.7369863 0.74857143 0.75 0.75706215
0.74792244 0.73783784 0.7534626 0.74456522]
mean value: 0.7475042362192859
key: test_recall
value: [0.37333333 0.30666667 0.36 0.37333333 0.42666667 0.4
0.44594595 0.35135135 0.32432432 0.41891892]
mean value: 0.3780540540540541
key: train_recall
value: [0.39940387 0.4023845 0.40089419 0.390462 0.41132638 0.39940387
0.40178571 0.40625 0.4047619 0.4077381 ]
mean value: 0.40244105279965936
key: test_accuracy
value: [0.82608696 0.80434783 0.82608696 0.82608696 0.82608696 0.80434783
0.81931464 0.81619938 0.81931464 0.80373832]
mean value: 0.8171610456454017
key: train_accuracy
value: [0.82895646 0.83102972 0.82791983 0.82826538 0.8317208 0.83102972
0.82970639 0.82867012 0.83108808 0.83005181]
mean value: 0.8298438314993918
key: test_roc_auc
value: [0.66844804 0.63106613 0.66380567 0.66844804 0.68701754 0.66356275
0.68856002 0.65340847 0.64596783 0.66897363]
mean value: 0.6639258124521282
key: train_roc_auc
value: [0.67900918 0.68139918 0.67885465 0.67543793 0.68497043 0.68035871
0.68042502 0.68130764 0.68236296 0.68272645]
mean value: 0.680685213759254
key: test_jcc
value: [0.33333333 0.26744186 0.3253012 0.33333333 0.36363636 0.32258065
0.36263736 0.30588235 0.29268293 0.32978723]
mean value: 0.32366166171990746
key: train_jcc
value: [0.35124509 0.35573123 0.35071708 0.34519104 0.36173001 0.35402906
0.35386632 0.3550065 0.35742444 0.35770235]
mean value: 0.3542643116567098
MCC on Blind test: 0.18
Accuracy on Blind test: 0.59
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.19730258 0.21396136 0.22605848 0.20361185 0.14368963 0.16601181
0.13597369 0.16187048 0.19390249 0.20522976]
mean value: 0.1847612142562866
key: score_time
value: [0.02176952 0.02164769 0.02040291 0.02041316 0.01316285 0.01319599
0.01312375 0.01315975 0.02045298 0.02036095]
mean value: 0.01776895523071289
key: test_mcc
value: [0.44653117 0.36054864 0.44490067 0.44653117 0.45597949 0.38823762
0.43800019 0.40465381 0.41041911 0.38935548]
mean value: 0.41851573611728093
key: train_mcc
value: [0.45780627 0.46502914 0.45471615 0.46284811 0.46862204 0.46460044
0.4611309 0.45853121 0.46608388 0.46319007]
mean value: 0.46225582177317665
key: test_fscore
value: [0.5 0.42201835 0.49090909 0.5 0.53333333 0.48780488
0.53225806 0.46846847 0.45283019 0.496 ]
mean value: 0.48836223725789
key: train_fscore
value: [0.51988361 0.52478134 0.51930502 0.52783109 0.53128008 0.52292683
0.52274927 0.52399232 0.52662149 0.52692308]
mean value: 0.5246294133018348
key: test_precision
value: [0.75675676 0.67647059 0.77142857 0.75675676 0.71111111 0.625
0.66 0.7027027 0.75 0.60784314]
mean value: 0.7018069624246095
key: train_precision
value: [0.74444444 0.75418994 0.7369863 0.74123989 0.75 0.75706215
0.74792244 0.73783784 0.7534626 0.74456522]
mean value: 0.7467710825804719
key: test_recall
value: [0.37333333 0.30666667 0.36 0.37333333 0.42666667 0.4
0.44594595 0.35135135 0.32432432 0.41891892]
mean value: 0.3780540540540541
key: train_recall
value: [0.39940387 0.4023845 0.40089419 0.40983607 0.41132638 0.39940387
0.40178571 0.40625 0.4047619 0.4077381 ]
mean value: 0.40437845965509894
key: test_accuracy
value: [0.82608696 0.80434783 0.82608696 0.82608696 0.82608696 0.80434783
0.81931464 0.81619938 0.81931464 0.80373832]
mean value: 0.8171610456454017
key: train_accuracy
value: [0.82895646 0.83102972 0.82791983 0.82999309 0.8317208 0.83102972
0.82970639 0.82867012 0.83108808 0.83005181]
mean value: 0.8300166027502556
key: test_roc_auc
value: [0.66844804 0.63106613 0.66380567 0.66844804 0.68701754 0.66356275
0.68856002 0.65340847 0.64596783 0.66897363]
mean value: 0.6639258124521282
key: train_roc_auc
value: [0.67900918 0.68139918 0.67885465 0.68332559 0.68497043 0.68035871
0.68042502 0.68130764 0.68236296 0.68272645]
mean value: 0.6814739801649315
key: test_jcc
value: [0.33333333 0.26744186 0.3253012 0.33333333 0.36363636 0.32258065
0.36263736 0.30588235 0.29268293 0.32978723]
mean value: 0.32366166171990746
key: train_jcc
value: [0.35124509 0.35573123 0.35071708 0.35853977 0.36173001 0.35402906
0.35386632 0.3550065 0.35742444 0.35770235]
mean value: 0.355599184104331
MCC on Blind test: 0.18
Accuracy on Blind test: 0.59
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.40994453 0.39608026 0.38155746 0.39474964 0.40328407 0.39687109
0.40119815 0.38227057 0.39209461 0.39678717]
mean value: 0.3954837560653687
key: score_time
value: [0.10240912 0.09279513 0.08923125 0.10023069 0.09933376 0.10402298
0.09787536 0.08846259 0.10363722 0.10497689]
mean value: 0.09829750061035156
key: test_mcc
value: [0.37222906 0.19573104 0.37222906 0.31918781 0.25195916 0.32987714
0.26689742 0.31793128 0.28249553 0.35506869]
mean value: 0.3063606191418288
key: train_mcc
value: [0.37883236 0.39147394 0.3778893 0.38496329 0.38752673 0.3832007
0.38008832 0.3849479 0.37794904 0.3906017 ]
mean value: 0.38374732681330104
key: test_fscore
value: [0.29545455 0.12345679 0.29545455 0.34343434 0.3 0.30434783
0.26373626 0.31914894 0.26666667 0.39215686]
mean value: 0.2903856779872089
key: train_fscore
value: [0.34730539 0.36449704 0.35349941 0.34698795 0.35406699 0.35629454
0.34886499 0.3547619 0.35 0.36705882]
mean value: 0.35433370341097303
key: test_precision
value: [1. 0.83333333 1. 0.70833333 0.6 0.82352941
0.70588235 0.75 0.75 0.71428571]
mean value: 0.7885364145658265
key: train_precision
value: [0.88414634 0.88505747 0.86627907 0.90566038 0.8969697 0.87719298
0.88484848 0.88690476 0.875 0.87640449]
mean value: 0.8838463680414822
key: test_recall
value: [0.17333333 0.06666667 0.17333333 0.22666667 0.2 0.18666667
0.16216216 0.2027027 0.16216216 0.27027027]
mean value: 0.18239639639639643
key: train_recall
value: [0.21609538 0.2295082 0.22205663 0.21460507 0.22056632 0.22354694
0.2172619 0.22172619 0.21875 0.23214286]
mean value: 0.22162594918742462
key: test_accuracy
value: [0.80745342 0.77950311 0.80745342 0.79813665 0.7826087 0.80124224
0.79127726 0.80062305 0.79439252 0.80685358]
mean value: 0.7969543932973433
key: train_accuracy
value: [0.81167934 0.81444368 0.81167934 0.81271596 0.81340705 0.81271596
0.81174439 0.81278066 0.81139896 0.81416235]
mean value: 0.8126727682669044
key: test_roc_auc
value: [0.58666667 0.53130904 0.58666667 0.59916329 0.57975709 0.58726046
0.57095962 0.59122989 0.57298392 0.6189408 ]
mean value: 0.5824937447569026
key: train_roc_auc
value: [0.60377419 0.61025567 0.60585513 0.60392871 0.6064595 0.60705013
0.60435745 0.60658959 0.60465165 0.61112316]
mean value: 0.6064045175536764
key: test_jcc
value: [0.17333333 0.06578947 0.17333333 0.20731707 0.17647059 0.17948718
0.15189873 0.18987342 0.15384615 0.24390244]
mean value: 0.17152517260133607
key: train_jcc
value: [0.21014493 0.22286541 0.21469741 0.20991254 0.21511628 0.21676301
0.21128799 0.21562952 0.21212121 0.22478386]
mean value: 0.21533221522618
MCC on Blind test: 0.14
Accuracy on Blind test: 0.57
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.1034801 0.10629749 0.08357286 0.10743833 0.10230422 0.09953213
0.10263681 0.09478331 0.11108303 0.09105873]
mean value: 0.10021870136260987
key: score_time
value: [0.0118773 0.01131654 0.0115478 0.01110339 0.01109219 0.01120925
0.01116109 0.01126146 0.01935029 0.01192021]
mean value: 0.012183952331542968
key: test_mcc
value: [0.39769091 0.4667411 0.36031408 0.37604057 0.10128971 0.24591617
0.10213101 0.317177 0.39887997 0.32111686]
mean value: 0.30872973711131735
key: train_mcc
value: [0.45686993 0.45329635 0.3828763 0.39238603 0.12690427 0.27048476
0.16789617 0.45297546 0.39095147 0.397736 ]
mean value: 0.3492376741862349
key: test_fscore
value: [0.45454545 0.60227273 0.35416667 0.3960396 0.02631579 0.24175824
0.02666667 0.40707965 0.54954955 0.50224215]
mean value: 0.3560636498377454
key: train_fscore
value: [0.52661597 0.5926373 0.41675618 0.40223464 0.04087591 0.21558442
0.08498584 0.50984252 0.54413893 0.55222337]
mean value: 0.3885895062638413
key: test_precision
value: [0.71428571 0.52475248 0.80952381 0.76923077 1. 0.6875
1. 0.58974359 0.41216216 0.37583893]
mean value: 0.6883037446368067
key: train_precision
value: [0.72703412 0.4979716 0.74615385 0.80357143 1. 0.83838384
0.88235294 0.75290698 0.40256959 0.42313788]
mean value: 0.7074082223733196
key: test_recall
value: [0.33333333 0.70666667 0.22666667 0.26666667 0.01333333 0.14666667
0.01351351 0.31081081 0.82432432 0.75675676]
mean value: 0.3598738738738739
key: train_recall
value: [0.41281669 0.73174367 0.28912072 0.26825633 0.02086438 0.12369598
0.04464286 0.38541667 0.83928571 0.79464286]
mean value: 0.3910485859768647
key: test_accuracy
value: [0.8136646 0.7826087 0.80745342 0.81055901 0.77018634 0.78571429
0.77258567 0.79127726 0.68847352 0.65420561]
mean value: 0.7676728391478493
key: train_accuracy
value: [0.82791983 0.76675881 0.81237042 0.81513476 0.77297858 0.79129233
0.77685665 0.82797927 0.67357513 0.70086356]
mean value: 0.7765729345331237
key: test_roc_auc
value: [0.64642375 0.75616734 0.60523617 0.62118758 0.50666667 0.56321188
0.50675676 0.62301674 0.7360488 0.69011927]
mean value: 0.6254834956413904
key: train_roc_auc
value: [0.68301653 0.7545358 0.62971555 0.62423163 0.51043219 0.55824925
0.52142174 0.67359002 0.73138375 0.73357874]
mean value: 0.6420155210586078
key: test_jcc
value: [0.29411765 0.43089431 0.21518987 0.24691358 0.01333333 0.1375
0.01351351 0.25555556 0.37888199 0.33532934]
mean value: 0.23212291409639557
key: train_jcc
value: [0.35741935 0.42109777 0.26322931 0.25174825 0.02086438 0.12081514
0.0443787 0.34214003 0.37375746 0.38142857]
mean value: 0.25768789558911614
MCC on Blind test: 0.05
Accuracy on Blind test: 0.55
Running classifier: 24
Model_name: XGBoost
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  scoresDF_BT['source_data'] = 'BT'
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.39725995 0.39528394 0.37187934 0.53835416 0.39634657 0.38194132
0.38941503 0.38354206 0.54003143 0.38880777]
mean value: 0.4182861566543579
key: score_time
value: [0.01200819 0.01216793 0.01218033 0.01233697 0.01218081 0.01222277
0.01223111 0.01241279 0.01223922 0.01203609]
mean value: 0.012201619148254395
key: test_mcc
value: [0.57520442 0.47485004 0.46287645 0.51434489 0.469056 0.46643181
0.47015971 0.38660377 0.47965091 0.46953365]
mean value: 0.47687116604782964
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.64 0.54237288 0.52991453 0.592 0.57971014 0.58156028
0.57777778 0.5 0.54700855 0.56923077]
mean value: 0.5659574933903035
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.8 0.74418605 0.73809524 0.74 0.63492063 0.62121212
0.63934426 0.59259259 0.74418605 0.66071429]
mean value: 0.6915251227853211
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.53333333 0.42666667 0.41333333 0.49333333 0.53333333 0.54666667
0.52702703 0.43243243 0.43243243 0.5 ]
mean value: 0.48385585585585583
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.86024845 0.83229814 0.82919255 0.84161491 0.81987578 0.81677019
0.82242991 0.80062305 0.83489097 0.82554517]
mean value: 0.8283489096573209
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.74642375 0.69106613 0.68439946 0.72035088 0.72010796 0.72272605
0.7189791 0.6716818 0.69394901 0.71153846]
mean value: 0.7081222599117336
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.47058824 0.37209302 0.36046512 0.42045455 0.40816327 0.41
0.40625 0.33333333 0.37647059 0.39784946]
mean value: 0.3955667569523888
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.11
Accuracy on Blind test: 0.57
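The SettingWithCopyWarning printed at the start of this classifier's block points at scoresDF_CV['source_data'] = 'CV' (and the BT equivalent) in MultClfs_logo_skf.py. The pandas-recommended pattern, sketched here on a stand-in frame because the surrounding code is not shown in the log, is to take an explicit copy of the slice and assign via .loc:

import pandas as pd

# Stand-in for the scores table that scoresDF_CV is sliced from (hypothetical data).
scores = pd.DataFrame({'MCC': [0.48, 0.47], 'F1': [0.57, 0.56]})
scoresDF_CV = scores[['MCC', 'F1']].copy()   # .copy() removes the view/copy ambiguity
scoresDF_CV.loc[:, 'source_data'] = 'CV'     # explicit .loc assignment, no warning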
Extracting tts_split_name: logo_skf_BT_katg
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_katg
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
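The rowbind reported above, CV (24, 8) plus BT (24, 8) giving a combined (48, 8) frame over the 8 shared metric columns, is an ordinary pd.concat along rows. The snippet below reproduces just that shape arithmetic with placeholder frames; the real metric values and the metadata merge are not recreated here.

import pandas as pd

cols = ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
cv_df = pd.DataFrame(0.0, index=range(24), columns=cols)   # placeholder for the CV scores
bt_df = pd.DataFrame(0.0, index=range(24), columns=cols)   # placeholder for the blind-test scores
cv_df['source_data'] = 'CV'
bt_df['source_data'] = 'BT'

combined_df_wf = pd.concat([cv_df, bt_df], axis=0, ignore_index=True)  # rowbind
print(combined_df_wf.shape)   # (48, 8)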
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: rpob
Total genes: 6
Training on: 5
Training on genes: ['alr', 'katg', 'pnca', 'gid', 'embb']
Omitted genes: ['rpob']
Blind test gene: rpob
/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_rpob.csv
Training data dim: (2901, 171)
Training Target dim: (2901,)
Checked training df does NOT have Target var
TEST data dim: (1132, 171)
TEST Target dim: (1132,)
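The blind-test setup above holds out every rpob row (1132 samples) and trains on the remaining five genes (2901 samples), i.e. a leave-one-gene-out split of the combined table. A rough sketch of that split is given below; the column names 'gene_name' (group label) and 'target' (class label) are hypothetical stand-ins, since the log only reports the resulting dimensions.

import pandas as pd

df = pd.read_csv('/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_rpob.csv')  # path from the log
bts_gene = 'rpob'

# Hypothetical column names: 'gene_name' holds the gene label, 'target' the class label.
train_df = df[df['gene_name'] != bts_gene]
bt_df    = df[df['gene_name'] == bts_gene]
X_train, y_train = train_df.drop(columns=['target']), train_df['target']
X_bt,    y_bt    = bt_df.drop(columns=['target']),    bt_df['target']
print(X_train.shape, X_bt.shape)   # expected to mirror the (2901, ...) / (1132, ...) dims above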
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
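Every "Running model pipeline" block above wraps one of these estimators in the same ColumnTransformer (MinMaxScaler on the 165 numeric columns, OneHotEncoder on the six categorical ones, remainder passed through). The sketch below shows that wiring in compressed form; the numeric column list is abbreviated and the loop and variable names are assumptions, not the script's own identifiers.

from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.svm import SVC

numeric_cols = ['KOLA920101', 'MIYS930101', 'snap2_score', 'volumetric_rr']   # 165 columns in the log
categorical_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
                    'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(transformers=[('num', MinMaxScaler(), numeric_cols),
                                       ('cat', OneHotEncoder(), categorical_cols)],
                         remainder='passthrough')

models = [('Logistic Regression', LogisticRegression(random_state=42)),
          ('SVC', SVC(random_state=42))]                     # abbreviated (name, estimator) list

for name, clf in models:
    pipe = Pipeline(steps=[('prep', prep), ('model', clf)])  # matches the printed pipelines
    # pipe is then cross-validated and evaluated on the blind-test gene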
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.75381494 0.75609851 0.65429807 0.67664051 0.67045712 0.70102692
0.65799236 0.65299225 0.65367532 0.64182711]
mean value: 0.6818823099136353
key: score_time
value: [0.0184412 0.01982999 0.01942301 0.02010179 0.02023053 0.01842284
0.01919365 0.01820827 0.01829147 0.01823354]
mean value: 0.019037628173828126
key: test_mcc
value: [0.48892964 0.27458647 0.44974636 0.50231453 0.49791486 0.46221765
0.39261138 0.3856817 0.41851276 0.40277165]
mean value: 0.4275287010390735
key: train_mcc
value: [0.52265403 0.53716682 0.5078412 0.50997562 0.53290634 0.52510104
0.52992609 0.52625461 0.49535555 0.54556193]
mean value: 0.5232743223650863
key: test_fscore
value: [0.61111111 0.43283582 0.58333333 0.60740741 0.59541985 0.59310345
0.54421769 0.54054054 0.54411765 0.54545455]
mean value: 0.559754138848022
key: train_fscore
value: [0.63615206 0.6453125 0.61625101 0.61786002 0.63924051 0.63464567
0.63772691 0.63693271 0.6092504 0.65210608]
mean value: 0.6325477857492051
key: test_precision
value: [0.67692308 0.51785714 0.63636364 0.71929825 0.73584906 0.65151515
0.58823529 0.57971014 0.64912281 0.609375 ]
mean value: 0.6364249555939543
key: train_precision
value: [0.70446735 0.72202797 0.71588785 0.71775701 0.72661871 0.71580817
0.72142857 0.71278459 0.69835466 0.72695652]
mean value: 0.7162091404744638
key: test_recall
value: [0.55696203 0.37179487 0.53846154 0.52564103 0.5 0.5443038
0.50632911 0.50632911 0.46835443 0.49367089]
mean value: 0.5011846802986044
key: train_recall
value: [0.57991513 0.58333333 0.54096045 0.54237288 0.57062147 0.57001414
0.57142857 0.57567185 0.54031117 0.59123055]
mean value: 0.5665859564164648
key: test_accuracy
value: [0.80756014 0.73793103 0.79310345 0.81724138 0.81724138 0.79655172
0.76896552 0.76551724 0.7862069 0.77586207]
mean value: 0.7866180827112217
key: train_accuracy
value: [0.82030651 0.82612026 0.81731137 0.81807736 0.82535427 0.82229031
0.82420529 0.82229031 0.81233244 0.82918422]
mean value: 0.8217472350254083
key: test_roc_auc
value: [0.72895271 0.62221819 0.712627 0.72508466 0.71698113 0.71764953
0.68681385 0.68444418 0.68678385 0.68759374]
mean value: 0.6969148832951485
key: train_roc_auc
value: [0.74476576 0.74989052 0.73054328 0.73151224 0.74537379 0.74299027
0.7447479 0.7447687 0.72682576 0.75438628]
mean value: 0.7415804511059569
key: test_jcc
value: [0.44 0.27619048 0.41176471 0.43617021 0.42391304 0.42156863
0.37383178 0.37037037 0.37373737 0.375 ]
mean value: 0.3902546585576706
key: train_jcc
value: [0.46643914 0.47635525 0.44534884 0.44703143 0.46976744 0.46482122
0.46813441 0.46727899 0.43807339 0.4837963 ]
mean value: 0.4627046412227413
MCC on Blind test: 0.27
Accuracy on Blind test: 0.74
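The fit_time/score_time/test_*/train_* keys printed for every classifier are the standard output of sklearn's cross_validate when given a dict of scorers and return_train_score=True. Below is a hedged sketch of a scoring dict that yields keys named like the ones above (mcc, fscore, precision, recall, accuracy, roc_auc, jcc); the scorers actually used by the script are not visible in this log and may differ.

# Sketch (assumptions): a multi-metric scoring dict that makes cross_validate
# return keys like test_mcc, train_fscore, test_jcc, ... as printed in this log.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=300, weights=[0.7, 0.3], random_state=42)

scoring = {
    'mcc': make_scorer(matthews_corrcoef),
    'fscore': 'f1',
    'precision': 'precision',
    'recall': 'recall',
    'accuracy': 'accuracy',
    'roc_auc': 'roc_auc',
    'jcc': make_scorer(jaccard_score),
}

scores = cross_validate(AdaBoostClassifier(random_state=42), X, y,
                        cv=StratifiedKFold(n_splits=10),
                        scoring=scoring, return_train_score=True)
print(sorted(scores))   # fit_time, score_time, test_accuracy, ..., train_roc_auc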
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.33121586 0.38014746 0.37260795 0.37865472 0.37736058 0.35474944
0.35080218 0.38265324 0.36514544 0.3667109 ]
mean value: 0.3660047769546509
key: score_time
value: [0.04646015 0.0468049 0.028404 0.03769946 0.04612613 0.02501464
0.02867436 0.04703593 0.04764962 0.0462153 ]
mean value: 0.04000844955444336
key: test_mcc
value: [0.42500102 0.34907601 0.34490045 0.47779289 0.52424076 0.39548697
0.50139589 0.33305325 0.45217864 0.41209555]
mean value: 0.4215221431553339
key: train_mcc
value: [0.96012252 0.95633685 0.94753117 0.95530521 0.94166195 0.95351288
0.95254519 0.95523913 0.96115067 0.95331875]
mean value: 0.9536724330319807
key: test_fscore
value: [0.52380952 0.47244094 0.47692308 0.58015267 0.59677419 0.52238806
0.61428571 0.48175182 0.56060606 0.53030303]
mean value: 0.5359435100632418
key: train_fscore
value: [0.9703543 0.96732026 0.9606414 0.96666667 0.95614035 0.96486091
0.96410256 0.96666667 0.97101449 0.96511628]
mean value: 0.9652883890992687
key: test_precision
value: [0.70212766 0.6122449 0.59615385 0.71698113 0.80434783 0.63636364
0.70491803 0.56896552 0.69811321 0.66037736]
mean value: 0.6700593114279563
key: train_precision
value: [0.99260355 0.9955157 0.99246988 0.99255952 0.99090909 1.
1. 0.9910847 0.99554235 0.99252616]
mean value: 0.9943210941135889
key: test_recall
value: [0.41772152 0.38461538 0.3974359 0.48717949 0.47435897 0.44303797
0.5443038 0.41772152 0.46835443 0.44303797]
mean value: 0.44777669587796165
key: train_recall
value: [0.94908062 0.94067797 0.93079096 0.9420904 0.92372881 0.9321075
0.93069307 0.94342291 0.9476662 0.93917963]
mean value: 0.9379438064871863
key: test_accuracy
value: [0.79381443 0.76896552 0.76551724 0.81034483 0.82758621 0.77931034
0.8137931 0.75517241 0.8 0.7862069 ]
mean value: 0.7900710984713829
key: train_accuracy
value: [0.98429119 0.98276522 0.97931827 0.98238223 0.9770203 0.98161624
0.98123324 0.98238223 0.9846802 0.98161624]
mean value: 0.9817305358555244
key: test_roc_auc
value: [0.67584189 0.64749637 0.64918965 0.70821239 0.71595307 0.67412562
0.72949787 0.64961905 0.69626252 0.67886496]
mean value: 0.6825063395333651
key: train_roc_auc
value: [0.9732266 0.96955075 0.96408177 0.96973148 0.96028795 0.96605375
0.96534653 0.97013583 0.97304528 0.96827679]
mean value: 0.9679736728952267
key: test_jcc
value: [0.35483871 0.30927835 0.31313131 0.40860215 0.42528736 0.35353535
0.44329897 0.31730769 0.38947368 0.36082474]
mean value: 0.3675578321577448
key: train_jcc
value: [0.94241573 0.93670886 0.92426367 0.93548387 0.91596639 0.9321075
0.93069307 0.93548387 0.94366197 0.93258427]
mean value: 0.9329369201465754
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/joblib/externals/loky/process_executor.py:702: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
MCC on Blind test: 0.31
Accuracy on Blind test: 0.75
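The _bagging.py warnings seen above ('Some inputs do not have OOB scores' followed by an invalid-value RuntimeWarning) typically come from BaggingClassifier(oob_score=True) when the ensemble is small enough that some training rows never fall outside a bootstrap sample, so their OOB prediction is 0/0. Below is a minimal sketch on synthetic data that makes the warning very likely and then avoids it by enlarging the ensemble; whether that is the right fix for this run is an assumption.

# Sketch (assumption about the cause): with oob_score=True, too few estimators
# can leave some rows never out-of-bag, triggering the warnings above.
import warnings
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=50, random_state=42)

for n in (3, 200):                    # tiny vs. comfortably large ensemble
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        BaggingClassifier(n_estimators=n, oob_score=True,
                          random_state=42).fit(X, y)
    print(f'n_estimators={n}: {len(caught)} warning(s) emitted')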
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.20158339 0.16982388 0.16223693 0.17325592 0.17822623 0.19894648
0.18185496 0.18414021 0.18705153 0.17839551]
mean value: 0.18155150413513182
key: score_time
value: [0.0106225 0.01002574 0.01045036 0.01013017 0.01031208 0.01034594
0.01004291 0.01022577 0.01012135 0.01079488]
mean value: 0.01030716896057129
key: test_mcc
value: [0.30626605 0.200885 0.36769307 0.26342525 0.39014168 0.36247844
0.42079907 0.3318156 0.32754366 0.3149541 ]
mean value: 0.32860019263608164
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.48684211 0.42236025 0.5408805 0.46153846 0.55900621 0.53503185
0.58823529 0.52631579 0.50955414 0.50909091]
mean value: 0.5138855509516989
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.50684932 0.40963855 0.5308642 0.46153846 0.54216867 0.53846154
0.54945055 0.48913043 0.51282051 0.48837209]
mean value: 0.5029294331591947
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.46835443 0.43589744 0.55128205 0.46153846 0.57692308 0.53164557
0.63291139 0.56962025 0.50632911 0.53164557]
mean value: 0.526614735475495
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.73195876 0.67931034 0.74827586 0.71034483 0.75517241 0.74827586
0.75862069 0.72068966 0.73448276 0.72068966]
mean value: 0.7307820831852115
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.64927155 0.60238268 0.68601838 0.63171263 0.6988389 0.68051473
0.7192993 0.67343572 0.66311716 0.66155738]
mean value: 0.6666148433704041
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.32173913 0.26771654 0.37068966 0.3 0.38793103 0.36521739
0.41666667 0.35714286 0.34188034 0.34146341]
mean value: 0.34704470271513854
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.2
Accuracy on Blind test: 0.7
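The 'MCC on Blind test' / 'Accuracy on Blind test' lines are consistent with scoring each refitted pipeline once on a held-out blind set and rounding to two decimals. Below is a sketch with a hypothetical blind split; the real blind set and refit strategy are not shown in this log.

# Sketch (assumptions): hold out a 'blind' split, fit on the rest, report MCC
# and accuracy rounded to two decimals as in the lines above.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, weights=[0.7, 0.3], random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_blind)

print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))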
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.0203557 0.02114511 0.02056384 0.02062964 0.02059078 0.02092385
0.02053761 0.02049851 0.02074099 0.02064013]
mean value: 0.020662617683410645
key: score_time
value: [0.0097847 0.00978279 0.00979662 0.00972199 0.00979948 0.00976944
0.00978994 0.00979877 0.00981903 0.00987411]
mean value: 0.009793686866760253
key: test_mcc
value: [0.18982739 0.14293815 0.14593974 0.19880545 0.21641497 0.13178036
0.41181068 0.24712143 0.09126307 0.15035376]
mean value: 0.19262549961903314
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.4125 0.36241611 0.38787879 0.41290323 0.42580645 0.37267081
0.57668712 0.46060606 0.32894737 0.37086093]
mean value: 0.41112768528779575
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.40740741 0.38028169 0.36781609 0.41558442 0.42857143 0.36585366
0.55952381 0.44186047 0.34246575 0.38888889]
mean value: 0.409825360914834
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.41772152 0.34615385 0.41025641 0.41025641 0.42307692 0.37974684
0.59493671 0.48101266 0.3164557 0.35443038]
mean value: 0.41340473872119443
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.67697595 0.67241379 0.65172414 0.6862069 0.69310345 0.65172414
0.76206897 0.69310345 0.64827586 0.67241379]
mean value: 0.6808010427775802
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.59565321 0.56930334 0.57541122 0.59899613 0.60776488 0.56665067
0.70979063 0.62676225 0.54448377 0.57294979]
mean value: 0.5967765891585047
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.25984252 0.22131148 0.2406015 0.2601626 0.2704918 0.22900763
0.40517241 0.2992126 0.19685039 0.22764228]
mean value: 0.2610295219688617
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.15
Accuracy on Blind test: 0.67
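'Extra Tree' (classifier 4) is sklearn's single randomized tree, ExtraTreeClassifier, while 'Extra Trees' (classifier 5) is the ExtraTreesClassifier ensemble of many such trees, which is the main reason the ensemble scores so much higher in this log. A small comparison sketch on synthetic data:

# Sketch: a single extremely randomised tree vs. an ensemble of them.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import ExtraTreeClassifier

X, y = make_classification(n_samples=400, n_features=30, random_state=42)

single = ExtraTreeClassifier(random_state=42)        # classifier 4 above
forest = ExtraTreesClassifier(random_state=42)       # classifier 5 above

print('single tree  :', cross_val_score(single, X, y, cv=10).mean().round(3))
print('tree ensemble:', cross_val_score(forest, X, y, cv=10).mean().round(3))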
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.46203732 0.46857953 0.45525265 0.46751118 0.46654129 0.46955276
0.4670248 0.44948626 0.46085215 0.45398235]
mean value: 0.46208202838897705
key: score_time
value: [0.0257318 0.0249536 0.0253799 0.02521706 0.02554107 0.02605772
0.0243907 0.02597046 0.02469659 0.02500701]
mean value: 0.02529458999633789
key: test_mcc
value: [0.39688066 0.30705863 0.377183 0.3504116 0.42191806 0.28096325
0.39548697 0.33192939 0.32317879 0.37416437]
mean value: 0.35591747198640267
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.48333333 0.42622951 0.50381679 0.45901639 0.4957265 0.38983051
0.52238806 0.46969697 0.4137931 0.46666667]
mean value: 0.46304978325802837
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.70731707 0.59090909 0.62264151 0.63636364 0.74358974 0.58974359
0.63636364 0.58490566 0.64864865 0.68292683]
mean value: 0.6443409417868691
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.36708861 0.33333333 0.42307692 0.35897436 0.37179487 0.29113924
0.44303797 0.39240506 0.30379747 0.35443038]
mean value: 0.36390782213567024
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.78694158 0.75862069 0.77586207 0.77241379 0.79655172 0.75172414
0.77931034 0.75862069 0.76551724 0.77931034]
mean value: 0.7724872615238771
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.65524242 0.62421384 0.66436865 0.64175133 0.66231253 0.60765493
0.67412562 0.64406983 0.62109305 0.6464095 ]
mean value: 0.6441241694958061
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.31868132 0.27083333 0.33673469 0.29787234 0.32954545 0.24210526
0.35353535 0.30693069 0.26086957 0.30434783]
mean value: 0.30214558419300924
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.26
Accuracy on Blind test: 0.74
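Every pipeline printed in this log shares the same 'prep' step: a ColumnTransformer that MinMax-scales the 165 numeric columns, one-hot encodes the six categorical columns, and passes any remaining columns through. Below is a minimal sketch of how such a transformer is typically assembled; the toy column names stand in for the real feature lists.

# Sketch (toy columns): a 'prep' ColumnTransformer like the one printed above,
# MinMax-scaling numeric columns and one-hot encoding categorical ones.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

df = pd.DataFrame({
    'snap2_score':   [0.1, 0.5, 0.9, 0.3],
    'volumetric_rr': [1.0, 2.0, 0.5, 1.5],
    'ss_class':      ['helix', 'sheet', 'loop', 'helix'],
    'active_site':   ['yes', 'no', 'no', 'yes'],
})
y = [1, 0, 0, 1]

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), ['snap2_score', 'volumetric_rr']),
                  ('cat', OneHotEncoder(), ['ss_class', 'active_site'])],
    remainder='passthrough')

pipe = Pipeline([('prep', prep),
                 ('model', ExtraTreesClassifier(random_state=42))])
pipe.fit(df, y)
print(pipe)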
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.02961183 3.01932168 3.02937746 3.03039384 3.02344155 3.03401184
3.02518487 3.03175235 3.02743173 3.03902411]
mean value: 3.0289551258087157
key: score_time
value: [0.01047587 0.01038647 0.01057076 0.01024008 0.01032877 0.0102973
0.0107429 0.01036954 0.01043534 0.01036525]
mean value: 0.010421228408813477
key: test_mcc
value: [0.45804702 0.37820386 0.48018281 0.53724972 0.5582808 0.46978312
0.44374423 0.44374423 0.49885384 0.42987106]
mean value: 0.4697960685148995
key: train_mcc
value: [0.68100017 0.6759307 0.65871872 0.67702047 0.68207394 0.67307968
0.68041949 0.66581259 0.66706572 0.6820528 ]
mean value: 0.6743174276885375
key: test_fscore
value: [0.57352941 0.5248227 0.58646617 0.64285714 0.64122137 0.59722222
0.57931034 0.57931034 0.60869565 0.55474453]
mean value: 0.5888179878715397
key: train_fscore
value: [0.75098814 0.74643423 0.73205742 0.74742676 0.75 0.74198718
0.74658085 0.73743017 0.73926868 0.75158228]
mean value: 0.744375571040439
key: test_precision
value: [0.68421053 0.58730159 0.70909091 0.72580645 0.79245283 0.66153846
0.63636364 0.63636364 0.71186441 0.65517241]
mean value: 0.6800164859348368
key: train_precision
value: [0.85125448 0.85018051 0.84065934 0.85045045 0.85948905 0.85582255
0.86567164 0.84615385 0.84392015 0.85278276]
mean value: 0.851638477668532
key: test_recall
value: [0.49367089 0.47435897 0.5 0.57692308 0.53846154 0.5443038
0.53164557 0.53164557 0.53164557 0.48101266]
mean value: 0.5203667640376501
key: train_recall
value: [0.6718529 0.66525424 0.64830508 0.66666667 0.66525424 0.65487977
0.6562942 0.65346535 0.65770863 0.6718529 ]
mean value: 0.6611533974220667
key: test_accuracy
value: [0.80068729 0.76896552 0.81034483 0.82758621 0.83793103 0.8
0.78965517 0.78965517 0.8137931 0.78965517]
mean value: 0.8028273492119921
key: train_accuracy
value: [0.87931034 0.87744159 0.87131367 0.87782459 0.87973956 0.8766756
0.87935657 0.87399464 0.87437763 0.87973956]
mean value: 0.8769773768803075
key: test_roc_auc
value: [0.70438261 0.67585873 0.71226415 0.7483672 0.74328737 0.7200192
0.70895075 0.70895075 0.72553842 0.69311296]
mean value: 0.714073214800726
key: train_roc_auc
value: [0.81411878 0.81081945 0.8012939 0.81152566 0.8123959 0.80695669
0.80923954 0.80467385 0.80627028 0.81439284]
mean value: 0.8091686885810748
key: test_jcc
value: [0.40206186 0.35576923 0.41489362 0.47368421 0.47191011 0.42574257
0.40776699 0.40776699 0.4375 0.38383838]
mean value: 0.41809339650248106
key: train_jcc
value: [0.60126582 0.5954488 0.57735849 0.59671302 0.6 0.58980892
0.59563543 0.5840708 0.58638083 0.60202788]
mean value: 0.5928709993206569
MCC on Blind test: 0.32
Accuracy on Blind test: 0.75
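For any one set of predictions the Jaccard index and F1 score are tied by J = F / (2 - F), since J = TP/(TP+FP+FN) and F = 2TP/(2TP+FP+FN); that is why each test_jcc entry above can be recovered from the matching test_fscore entry. A quick check against the first Gradient Boosting fold:

# Check: Jaccard = F1 / (2 - F1) for the same predictions; values taken from
# the first Gradient Boosting fold above.
f1_fold1 = 0.57352941          # test_fscore, fold 1
jcc_fold1 = 0.40206186         # test_jcc, fold 1

print(round(f1_fold1 / (2 - f1_fold1), 8), '~=', jcc_fold1)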
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02035785 0.0202806 0.0209403 0.02149963 0.02135372 0.02047777
0.02154589 0.02104974 0.02104235 0.02179003]
mean value: 0.02103378772735596
key: score_time
value: [0.01102161 0.01124048 0.01119471 0.01025963 0.01035738 0.01035094
0.0101788 0.01015615 0.01053905 0.01030302]
mean value: 0.010560178756713867
key: test_mcc
value: [0.26854895 0.20643709 0.18999058 0.31798229 0.22471858 0.43562665
0.34788513 0.22331346 0.27169576 0.29897574]
mean value: 0.27851742341878494
key: train_mcc
value: [0.28696191 0.29396935 0.28221446 0.28268592 0.28142013 0.2797154
0.27292741 0.28082226 0.27616184 0.2907446 ]
mean value: 0.282762327948638
key: test_fscore
value: [0.50241546 0.46445498 0.45098039 0.53 0.47663551 0.60696517
0.55172414 0.48245614 0.50485437 0.51578947]
mean value: 0.5086275636443585
key: train_fscore
value: [0.51256831 0.516977 0.51165254 0.51056911 0.51022605 0.50742983
0.50533049 0.50967742 0.50744681 0.51409619]
mean value: 0.5105973735892858
key: test_precision
value: [0.40625 0.36842105 0.36507937 0.43442623 0.375 0.5
0.4516129 0.36912752 0.40944882 0.44144144]
mean value: 0.41208073275625506
key: train_precision
value: [0.41763134 0.42218247 0.40932203 0.41424802 0.41217391 0.41531532
0.40547476 0.41110147 0.40664962 0.42196007]
mean value: 0.413605902480636
key: test_recall
value: [0.65822785 0.62820513 0.58974359 0.67948718 0.65384615 0.7721519
0.70886076 0.69620253 0.65822785 0.62025316]
mean value: 0.6665206101914962
key: train_recall
value: [0.66336634 0.66666667 0.68220339 0.66525424 0.66949153 0.65205092
0.67043847 0.67043847 0.67468175 0.65770863]
mean value: 0.6672300401953029
key: test_accuracy
value: [0.64604811 0.61034483 0.6137931 0.67586207 0.6137931 0.72758621
0.6862069 0.59310345 0.64827586 0.68275862]
mean value: 0.649777224789667
key: train_accuracy
value: [0.65823755 0.66219839 0.64687859 0.6541555 0.65147453 0.65721946
0.64458062 0.65070854 0.64534661 0.66334738]
mean value: 0.6534147161067749
key: test_roc_auc
value: [0.64986864 0.61598936 0.60619255 0.67700774 0.62645138 0.74152619
0.69329294 0.62535245 0.65138881 0.66320715]
mean value: 0.6550277199218235
key: train_roc_auc
value: [0.65984922 0.66360133 0.6579698 0.65764026 0.65713147 0.65559479
0.65270873 0.65691041 0.65456777 0.6615749 ]
mean value: 0.6577548678391654
key: test_jcc
value: [0.33548387 0.30246914 0.29113924 0.36054422 0.31288344 0.43571429
0.38095238 0.31791908 0.33766234 0.34751773]
mean value: 0.34222857105164034
key: train_jcc
value: [0.34459956 0.34859675 0.34377224 0.34279476 0.34248555 0.3399705
0.33808845 0.34199134 0.33998574 0.34598214]
mean value: 0.3428267036702492
MCC on Blind test: 0.24
Accuracy on Blind test: 0.68
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.06361818 3.17689848 3.06830978 3.06564426 3.05455494 3.10950708
3.10421062 3.26355982 3.10964894 3.27902532]
mean value: 3.129497742652893
key: score_time
value: [0.08739638 0.08807635 0.08785772 0.11058259 0.09864497 0.09826589
0.12053704 0.09489512 0.0876236 0.08755136]
mean value: 0.09614310264587403
key: test_mcc
value: [0.32233002 0.23739067 0.2570065 0.32550068 0.33997938 0.09800313
0.28658071 0.15531334 0.17081136 0.22451114]
mean value: 0.2417426924167553
key: train_mcc
value: [0.60879886 0.60705186 0.60176036 0.58656334 0.57575076 0.59530928
0.58942501 0.59535136 0.58835735 0.58900706]
mean value: 0.5937375234360462
key: test_fscore
value: [0.33663366 0.28 0.31067961 0.34 0.35643564 0.18
0.35185185 0.27027027 0.25 0.31192661]
mean value: 0.29877976462078876
key: train_fscore
value: [0.63516068 0.63730084 0.63018868 0.61465271 0.6042065 0.6159769
0.61787072 0.62369668 0.61228407 0.61078998]
mean value: 0.6202127771514355
key: test_precision
value: [0.77272727 0.63636364 0.64 0.77272727 0.7826087 0.42857143
0.65517241 0.46875 0.52 0.56666667]
mean value: 0.6243587386501555
key: train_precision
value: [0.95726496 0.94707521 0.94886364 0.94169096 0.93491124 0.96385542
0.94202899 0.9454023 0.95223881 0.95770393]
mean value: 0.9491035446752083
key: test_recall
value: [0.21518987 0.17948718 0.20512821 0.21794872 0.23076923 0.11392405
0.24050633 0.18987342 0.16455696 0.21518987]
mean value: 0.19725738396624476
key: train_recall
value: [0.47524752 0.48022599 0.47175141 0.45621469 0.44632768 0.45261669
0.45968883 0.46534653 0.45120226 0.44837341]
mean value: 0.46069950215360517
key: test_accuracy
value: [0.76975945 0.75172414 0.75517241 0.77241379 0.77586207 0.71724138
0.75862069 0.72068966 0.73103448 0.74137931]
mean value: 0.7493897381206304
key: train_accuracy
value: [0.85210728 0.85178093 0.84986595 0.84488702 0.84144006 0.84718499
0.846036 0.84795098 0.84527001 0.84527001]
mean value: 0.8471793223776214
key: test_roc_auc
value: [0.59580248 0.57087567 0.58133769 0.59718191 0.60359216 0.52852601
0.59655648 0.55465235 0.55384246 0.57678925]
mean value: 0.5759156453945504
key: train_roc_auc
value: [0.73368262 0.73512088 0.73114633 0.72285248 0.71738349 0.72315708
0.72459231 0.72768377 0.72139945 0.72051023]
mean value: 0.725752865685757
key: test_jcc
value: [0.20238095 0.1627907 0.18390805 0.20481928 0.21686747 0.0989011
0.21348315 0.15625 0.14285714 0.18478261]
mean value: 0.1767040439541644
key: train_jcc
value: [0.46537396 0.46767538 0.4600551 0.44368132 0.43287671 0.44506259
0.44704264 0.45316804 0.44121715 0.43966713]
mean value: 0.4495820018656535
MCC on Blind test: 0.16
Accuracy on Blind test: 0.72
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02079248 0.01870298 0.01637197 0.01716566 0.01867509 0.01888561
0.01881814 0.01857972 0.01686525 0.01802182]
mean value: 0.01828787326812744
key: score_time
value: [0.0488627 0.03017735 0.03352404 0.02759957 0.02723122 0.02828383
0.02774453 0.02777505 0.0267837 0.02872252]
mean value: 0.03067045211791992
key: test_mcc
value: [0.23046111 0.12079547 0.16428751 0.15007355 0.23403594 0.15185815
0.19856587 0.17955593 0.20710021 0.31368969]
mean value: 0.19504234281053048
key: train_mcc
value: [0.48370711 0.48573865 0.47695395 0.47536364 0.48705368 0.48426922
0.48102324 0.49240367 0.48922057 0.46802516]
mean value: 0.482375889649476
key: test_fscore
value: [0.37096774 0.2992126 0.29565217 0.34074074 0.37398374 0.3
0.36641221 0.336 0.32758621 0.46715328]
mean value: 0.34777087001604057
key: train_fscore
value: [0.57588899 0.58704794 0.57557643 0.57841484 0.58594412 0.57417103
0.57586207 0.58448276 0.57913043 0.57021277]
mean value: 0.5786731368051093
key: test_precision
value: [0.51111111 0.3877551 0.45945946 0.40350877 0.51111111 0.43902439
0.46153846 0.45652174 0.51351351 0.55172414]
mean value: 0.4695267798009669
key: train_precision
value: [0.74439462 0.72557173 0.72786177 0.71757322 0.73150106 0.74943052
0.73730684 0.74834437 0.751693 0.71581197]
mean value: 0.7349489100419229
key: test_recall
value: [0.29113924 0.24358974 0.21794872 0.29487179 0.29487179 0.2278481
0.30379747 0.26582278 0.24050633 0.40506329]
mean value: 0.27854592664719247
key: train_recall
value: [0.46958982 0.49293785 0.4759887 0.48446328 0.48870056 0.46534653
0.47241867 0.47949081 0.47100424 0.4738331 ]
mean value: 0.4773773563797058
key: test_accuracy
value: [0.73195876 0.69310345 0.72068966 0.69310345 0.73448276 0.71034483
0.7137931 0.7137931 0.73103448 0.74827586]
mean value: 0.7190579452541771
key: train_accuracy
value: [0.81264368 0.81194944 0.80965147 0.80850249 0.81271543 0.81309843
0.81156645 0.8153964 0.81463041 0.80658751]
mean value: 0.8116741724886312
key: test_roc_auc
value: [0.59368283 0.55104015 0.56180455 0.56724722 0.5955491 0.55942168
0.58554802 0.57366969 0.57759914 0.64092027]
mean value: 0.5806482651209672
key: train_roc_auc
value: [0.7048422 0.71178685 0.70488873 0.70676133 0.71098192 0.70378671
0.70495934 0.70980843 0.70661557 0.70199008]
mean value: 0.7066421141622449
key: test_jcc
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
[0.22772277 0.17592593 0.17346939 0.20535714 0.23 0.17647059
0.22429907 0.20192308 0.19587629 0.3047619 ]
mean value: 0.21158061528160288
key: train_jcc
value: [0.4043849 0.41547619 0.40407674 0.40688019 0.41437126 0.40269278
0.40435835 0.41291108 0.40758874 0.39880952]
mean value: 0.4071549751948521
MCC on Blind test: 0.18
Accuracy on Blind test: 0.71
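The lbfgs ConvergenceWarning interleaved above is emitted by logistic-regression fits elsewhere in the run (their stderr output appears to land out of order here), not by the K-Nearest Neighbors model; raising max_iter or rescaling the inputs, as the warning itself suggests, is the usual remedy. A minimal sketch on synthetic data; the real solver settings are assumptions.

# Sketch (assumptions): the warning comes from lbfgs hitting max_iter; count
# ConvergenceWarnings with a tiny vs. a generous iteration budget.
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=50, random_state=42)

def n_convergence_warnings(max_iter):
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        LogisticRegression(max_iter=max_iter, random_state=42).fit(X, y)
    return sum(issubclass(w.category, ConvergenceWarning) for w in caught)

print('max_iter=5    ->', n_convergence_warnings(5), 'ConvergenceWarning(s)')
print('max_iter=5000 ->', n_convergence_warnings(5000), 'ConvergenceWarning(s)')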
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.11840487 0.11046433 0.10493064 0.10589504 0.10549712 0.10513425
0.10571051 0.10633421 0.11581612 0.10470939]
mean value: 0.10828964710235596
key: score_time
value: [0.02614903 0.0133338 0.01327896 0.01327944 0.01336193 0.01322389
0.01352978 0.01324964 0.02602816 0.01329923]
mean value: 0.01587338447570801
key: test_mcc
value: [0.43327237 0.40819862 0.38977484 0.38888112 0.50893085 0.42987106
0.37582056 0.37432986 0.44923462 0.37572965]
mean value: 0.41340435620571114
key: train_mcc
value: [0.49983264 0.50765358 0.50104315 0.51215493 0.48375687 0.4926248
0.49027139 0.50381382 0.50140189 0.50049883]
mean value: 0.4993051904406385
key: test_fscore
value: [0.56944444 0.54014599 0.53521127 0.52554745 0.60606061 0.55474453
0.51798561 0.53061224 0.56934307 0.50746269]
mean value: 0.545655788298441
key: train_fscore
value: [0.61305732 0.62025316 0.61354582 0.62401264 0.59871589 0.6064
0.60697306 0.61697066 0.61477363 0.61428571]
mean value: 0.6128987896207706
key: test_precision
value: [0.63076923 0.62711864 0.59375 0.61016949 0.74074074 0.65517241
0.6 0.57352941 0.67241379 0.61818182]
mean value: 0.6321845543946268
key: train_precision
value: [0.70127505 0.70503597 0.70383912 0.7078853 0.69330855 0.69797422
0.69009009 0.70216606 0.70108696 0.69981917]
mean value: 0.7002480491170633
key: test_recall
value: [0.51898734 0.47435897 0.48717949 0.46153846 0.51282051 0.48101266
0.4556962 0.49367089 0.49367089 0.43037975]
mean value: 0.4809315157416423
key: train_recall
value: [0.54455446 0.55367232 0.54378531 0.5579096 0.52683616 0.53606789
0.5417256 0.55021216 0.54738331 0.54738331]
mean value: 0.5449530122503777
key: test_accuracy
value: [0.78694158 0.78275862 0.77241379 0.77586207 0.82068966 0.78965517
0.76896552 0.76206897 0.79655172 0.77241379]
mean value: 0.7828320891100841
key: train_accuracy
value: [0.8137931 0.81616239 0.81424741 0.81769437 0.80850249 0.81156645
0.81003447 0.8150134 0.81424741 0.81386442]
mean value: 0.8135125926121581
key: test_roc_auc
value: [0.7028899 0.68529269 0.68226899 0.67652395 0.72339139 0.69311296
0.67097606 0.67811506 0.70181175 0.66542684]
mean value: 0.6879809595161758
key: train_roc_auc
value: [0.72918737 0.7337463 0.72932828 0.73612769 0.72006548 0.72496672
0.72569473 0.73177625 0.73036182 0.73009922]
mean value: 0.7291353861774724
key: test_jcc
value: [0.39805825 0.37 0.36538462 0.35643564 0.43478261 0.38383838
0.34951456 0.36111111 0.39795918 0.34 ]
mean value: 0.37570843618015687
key: train_jcc
value: [0.44202067 0.44954128 0.44252874 0.45350172 0.42726231 0.43513203
0.43572241 0.44610092 0.44380734 0.44329897]
mean value: 0.441891639188729
MCC on Blind test: 0.23
Accuracy on Blind test: 0.73
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.06571412 0.06452179 0.08359385 0.10029984 0.07061386 0.06450844
0.06392503 0.07093811 0.07319975 0.07253933]
mean value: 0.07298541069030762
key: score_time
value: [0.01383138 0.01689005 0.01933289 0.01641679 0.01641417 0.02375102
0.02350068 0.01655054 0.01563954 0.01622486]
mean value: 0.017855191230773927
key: test_mcc
value: [0.44404567 0.36909197 0.40819862 0.45809054 0.51644394 0.4574942
0.40637249 0.31292341 0.46859818 0.35597233]
mean value: 0.4197231339328784
key: train_mcc
value: [0.47607536 0.48501099 0.48046944 0.47182581 0.46379243 0.4780113
0.4783075 0.48666277 0.48076222 0.49370188]
mean value: 0.4794619717791059
key: test_fscore
value: [0.55639098 0.5 0.54014599 0.57971014 0.60465116 0.57352941
0.54285714 0.48648649 0.58394161 0.49253731]
mean value: 0.546025023094389
key: train_fscore
value: [0.59042985 0.59691809 0.59150327 0.58804523 0.57980456 0.58986928
0.59053834 0.59640523 0.59283388 0.60260586]
mean value: 0.5918953579626515
key: test_precision
value: [0.68518519 0.61111111 0.62711864 0.66666667 0.76470588 0.68421053
0.62295082 0.52173913 0.68965517 0.6 ]
mean value: 0.6473343138220196
key: train_precision
value: [0.69201521 0.70095238 0.70155039 0.68679245 0.68461538 0.69825919
0.69749518 0.70599613 0.69865643 0.71017274]
mean value: 0.6976505491977688
key: test_recall
value: [0.46835443 0.42307692 0.47435897 0.51282051 0.5 0.49367089
0.48101266 0.4556962 0.50632911 0.41772152]
mean value: 0.4733041220382992
key: train_recall
value: [0.51485149 0.51977401 0.51129944 0.51412429 0.50282486 0.5106082
0.51202263 0.51626591 0.51485149 0.52333805]
mean value: 0.5139960364075148
key: test_accuracy
value: [0.79725086 0.77241379 0.78275862 0.8 0.82413793 0.8
0.77931034 0.73793103 0.80344828 0.76551724]
mean value: 0.786276810048584
key: train_accuracy
value: [0.80651341 0.80965147 0.80850249 0.80467254 0.80237457 0.8077365
0.8077365 0.81080046 0.80850249 0.81309843]
mean value: 0.8079588859980836
key: test_roc_auc
value: [0.69408288 0.66201016 0.68529269 0.70924045 0.72169811 0.70418141
0.68600396 0.64964905 0.71051053 0.65672806]
mean value: 0.6879397298021238
key: train_roc_auc
value: [0.71486137 0.71863635 0.71518729 0.7134468 0.70832257 0.71433772
0.71478232 0.71821699 0.71619675 0.72201566]
mean value: 0.7156003825599045
key: test_jcc
value: [0.38541667 0.33333333 0.37 0.40816327 0.43333333 0.40206186
0.37254902 0.32142857 0.41237113 0.32673267]
mean value: 0.37653898526339186
key: train_jcc
value: [0.41887227 0.42543353 0.4199536 0.41647597 0.40825688 0.41830823
0.41898148 0.42491269 0.4212963 0.43123543]
mean value: 0.42037263678481696
MCC on Blind test: 0.23
Accuracy on Blind test: 0.73
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [0.77692175 0.92846012 0.80041099 0.94950151 0.84537244 0.7996943
0.88558698 0.80906248 0.93071842 0.81407499]
mean value: 0.8539803981781006
key: score_time
value: [0.01339412 0.0134964 0.01339602 0.01341558 0.01345897 0.01346588
0.01339674 0.01347041 0.01356316 0.01343894]
mean value: 0.013449621200561524
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.72852234 0.73103448 0.73103448 0.73103448 0.73103448 0.72758621
0.72758621 0.72758621 0.72758621 0.72758621]
mean value: 0.7290591302287002
key: train_accuracy
value: [0.72911877 0.72883953 0.72883953 0.72883953 0.72883953 0.72922252
0.72922252 0.72922252 0.72922252 0.72922252]
mean value: 0.729058947482725
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.71
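Note: the ConvergenceWarning and UndefinedMetricWarning captured above name their own mitigations: give lbfgs more iterations via max_iter (the numeric inputs are already min-max scaled) and pass zero_division when building a precision scorer. A hedged sketch; the parameter values below are illustrative, not what this run used:

# Sketch only; the logged run used LogisticRegressionCV(cv=3, random_state=42) with defaults otherwise.
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer, precision_score

model = LogisticRegressionCV(cv=3, max_iter=3000, random_state=42)   # more lbfgs iterations (illustrative value)
precision_scorer = make_scorer(precision_score, zero_division=0)     # report 0.0 without the ill-defined-precision warning

Note that zero_division only changes how an undefined precision is reported; the all-zero recall/F-score and the ROC AUC of 0.5 above indicate the fitted model predicted no positive-class samples, which neither setting fixes by itself.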
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [2.5164814 3.25984097 6.25914717 2.37041879 3.45459938 2.44113636
4.65658593 1.6211729 1.87610316 2.94704747]
mean value: 3.1402533531188963
key: score_time
value: [0.01388025 0.01370716 0.01358414 0.01375294 0.01423216 0.01360965
0.01365423 0.01381755 0.01595449 0.0137217 ]
mean value: 0.013991427421569825
key: test_mcc
value: [0.40143417 0.34048752 0.36956363 0.42036538 0.52650277 0.35193375
0.39921331 0.36247844 0.45230334 0.31940096]
mean value: 0.3943683280526854
key: train_mcc
value: [0.46529397 0.52497296 0.6049342 0.49477149 0.52356274 0.47179846
0.565655 0.49896555 0.50479624 0.52106 ]
mean value: 0.5175810626560724
key: test_fscore
value: [0.5 0.4964539 0.51094891 0.52380952 0.63309353 0.45
0.55844156 0.53503185 0.57553957 0.44444444]
mean value: 0.5227763273173174
key: train_fscore
value: [0.54413103 0.63770365 0.69212411 0.59884202 0.64597191 0.54293629
0.67419112 0.62386707 0.61476726 0.61730449]
mean value: 0.619183893836618
key: test_precision
value: [0.68888889 0.55555556 0.59322034 0.6875 0.72131148 0.65853659
0.57333333 0.53846154 0.66666667 0.59574468]
mean value: 0.6279219063515787
key: train_precision
value: [0.7627551 0.70740103 0.79234973 0.72255489 0.67751938 0.78191489
0.72025723 0.66936791 0.71057514 0.74949495]
mean value: 0.7294190257807008
key: test_recall
value: [0.39240506 0.44871795 0.44871795 0.42307692 0.56410256 0.34177215
0.5443038 0.53164557 0.50632911 0.35443038]
mean value: 0.4555501460564752
key: train_recall
value: [0.42291372 0.58050847 0.61440678 0.51129944 0.61723164 0.41584158
0.63366337 0.58415842 0.5417256 0.52475248]
mean value: 0.5446501490342739
key: test_accuracy
value: [0.78694158 0.75517241 0.76896552 0.79310345 0.82413793 0.77241379
0.76551724 0.74827586 0.79655172 0.75862069]
mean value: 0.776970020144567
key: train_accuracy
value: [0.80804598 0.82114133 0.85178093 0.81424741 0.81654538 0.81041746
0.83416316 0.80926848 0.81616239 0.82382229]
mean value: 0.8205594808876681
key: test_roc_auc
value: [0.66318366 0.65832124 0.6677552 0.6761611 0.74195694 0.63771072
0.69632251 0.68051473 0.70577119 0.63219149]
mean value: 0.6759888796990773
key: train_roc_auc
value: [0.68702176 0.74558792 0.77725068 0.71912844 0.75396527 0.68638718
0.77113841 0.73850778 0.72989641 0.72981321]
mean value: 0.7338697055066847
key: test_jcc
value: [0.33333333 0.33018868 0.34313725 0.35483871 0.46315789 0.29032258
0.38738739 0.36521739 0.4040404 0.28571429]
mean value: 0.3557337920986424
key: train_jcc
value: [0.37375 0.46810934 0.52919708 0.42739079 0.47707424 0.37262357
0.50851305 0.45334797 0.4438007 0.44645006]
mean value: 0.4500256798709832
MCC on Blind test: 0.25
Accuracy on Blind test: 0.73
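Note: every pipeline printed in this log uses the same preprocessing step: MinMaxScaler over 165 numeric columns, OneHotEncoder over the six categorical columns, remainder='passthrough'. A sketch of how such a preprocessor is assembled; the numeric column list is truncated here to a few names taken from the printout:

# Sketch of the 'prep' step shown in each pipeline above (column list truncated).
from sklearn.compose import ColumnTransformer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

numeric_cols = ['KOLA920101', 'MIYS930101', 'snap2_score', 'volumetric_rr']   # 165 columns in the logged run
categorical_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
                    'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), numeric_cols),
                  ('cat', OneHotEncoder(), categorical_cols)],
    remainder='passthrough')

# Fitting requires a DataFrame containing these columns.
pipe = Pipeline(steps=[('prep', prep),
                       ('model', MLPClassifier(max_iter=500, random_state=42))])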
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02427411 0.02475619 0.02484536 0.02448177 0.02479672 0.02454185
0.02467513 0.02867961 0.0286212 0.02721596]
mean value: 0.025688791275024415
key: score_time
value: [0.01460671 0.01415586 0.01425767 0.01446605 0.01420689 0.01448083
0.01447296 0.01408863 0.0134182 0.01418924]
mean value: 0.014234304428100586
key: test_mcc
value: [0.20163054 0.15820029 0.09992635 0.12485261 0.1635864 0.25188115
0.30776802 0.11271663 0.2123138 0.24690033]
mean value: 0.18797761237552896
key: train_mcc
value: [0.20411926 0.20878836 0.1975919 0.21036064 0.19552932 0.2024877
0.18745966 0.20168349 0.193292 0.20341092]
mean value: 0.2004723238556453
key: test_fscore
value: [0.42236025 0.38461538 0.35365854 0.34899329 0.38709677 0.44736842
0.47945205 0.37714286 0.43209877 0.43055556]
mean value: 0.40633418864097715
key: train_fscore
value: [0.41690544 0.42007168 0.42432432 0.41830065 0.4202601 0.39846154
0.4054247 0.42141864 0.41735537 0.40362812]
mean value: 0.41461505643750385
key: test_precision
value: [0.41463415 0.38461538 0.3372093 0.36619718 0.38961039 0.46575342
0.52238806 0.34375 0.42168675 0.47692308]
mean value: 0.41227677142614655
key: train_precision
value: [0.42235123 0.42649199 0.40673575 0.43049327 0.40770252 0.43676223
0.4092219 0.41450068 0.40671141 0.43344156]
mean value: 0.41944125557468787
key: test_recall
value: [0.43037975 0.38461538 0.37179487 0.33333333 0.38461538 0.43037975
0.44303797 0.41772152 0.44303797 0.39240506]
mean value: 0.40313209996754307
key: train_recall
value: [0.4115983 0.41384181 0.44350282 0.40677966 0.43361582 0.36633663
0.40169731 0.42857143 0.42857143 0.37765205]
mean value: 0.41121672699957645
key: test_accuracy
value: [0.68041237 0.66896552 0.63448276 0.66551724 0.67241379 0.71034483
0.73793103 0.62413793 0.68275862 0.71724138]
mean value: 0.6794205474582296
key: train_accuracy
value: [0.68812261 0.69015703 0.67368824 0.69322099 0.67560322 0.70049789
0.68096515 0.68134814 0.67598621 0.69781693]
mean value: 0.6857406404674593
key: test_roc_auc
value: [0.60198233 0.57910015 0.55146347 0.56053459 0.58145864 0.62277281
0.6456896 0.55957166 0.60777491 0.61563381]
mean value: 0.5925981969926242
key: train_roc_auc
value: [0.60122742 0.60340015 0.6014151 0.60328473 0.59962451 0.59545823
0.59318059 0.60189076 0.59821429 0.59717687]
mean value: 0.5994872649027037
key: test_jcc
value: [0.26771654 0.23809524 0.21481481 0.21138211 0.24 0.28813559
0.31531532 0.23239437 0.27559055 0.27433628]
mean value: 0.25577808112640427
key: train_jcc
value: [0.26334842 0.26588022 0.26929674 0.26446281 0.2660312 0.24879923
0.25425246 0.26696035 0.26370757 0.25284091]
mean value: 0.2615579907603406
MCC on Blind test: 0.26
Accuracy on Blind test: 0.73
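Note: MultinomialNB requires non-negative feature values, which the MinMaxScaler branch of the pipeline guarantees for the scaled columns (each is mapped into [0, 1]). A tiny self-contained illustration with toy numbers, not this dataset:

# Toy illustration: min-max scaling makes the data acceptable to MultinomialNB.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler

X = np.array([[-2.0, 10.0], [0.0, 20.0], [4.0, 30.0], [6.0, 40.0]])
y = np.array([0, 0, 1, 1])

X_scaled = MinMaxScaler().fit_transform(X)          # all values now in [0, 1]
print(MultinomialNB().fit(X_scaled, y).predict(X_scaled))
# Fitting MultinomialNB on the raw X (which contains -2.0) raises a ValueError about negative values.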
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02549672 0.02579665 0.02549148 0.02565217 0.02551889 0.02574134
0.025702 0.02566624 0.02579808 0.02559924]
mean value: 0.025646281242370606
key: score_time
value: [0.01339149 0.01335216 0.01347423 0.01341152 0.01343775 0.01344442
0.01337123 0.01344681 0.01344109 0.01340437]
mean value: 0.013417506217956543
key: test_mcc
value: [0.07911173 0.1331885 0.06943724 0.14080358 0.08136687 0.09102162
0.22947503 0.03130627 0.23894743 0.1859864 ]
mean value: 0.12806446644303063
key: train_mcc
value: [0.19065583 0.17038377 0.18001738 0.1781155 0.17571898 0.17717793
0.16171279 0.1676608 0.15780584 0.16216091]
mean value: 0.17214097348776364
key: test_fscore
value: [0.24793388 0.27586207 0.27272727 0.29752066 0.27692308 0.24137931
0.32432432 0.24242424 0.39694656 0.34645669]
mean value: 0.2922498098962689
key: train_fscore
value: [0.33394495 0.31481481 0.3496144 0.32447296 0.34116623 0.315197
0.29952607 0.34075342 0.33016422 0.30393996]
mean value: 0.3253594026335707
key: test_precision
value: [0.35714286 0.42105263 0.33333333 0.41860465 0.34615385 0.37837838
0.5625 0.30188679 0.5 0.45833333]
mean value: 0.4077385823536316
key: train_precision
value: [0.47519582 0.45698925 0.44444444 0.46214099 0.44444444 0.46796657
0.45402299 0.43167028 0.42444444 0.45125348]
mean value: 0.4512572721478286
key: test_recall
value: [0.18987342 0.20512821 0.23076923 0.23076923 0.23076923 0.17721519
0.2278481 0.20253165 0.32911392 0.27848101]
mean value: 0.23024991885751384
key: train_recall
value: [0.25742574 0.24011299 0.28813559 0.25 0.27683616 0.23762376
0.22347949 0.281471 0.27015559 0.2291372 ]
mean value: 0.25543775321842116
key: test_accuracy
value: [0.68728522 0.71034483 0.66896552 0.70689655 0.67586207 0.69655172
0.74137931 0.65517241 0.72758621 0.7137931 ]
mean value: 0.6983836947505628
key: train_accuracy
value: [0.72183908 0.71658368 0.70930678 0.71773267 0.71007277 0.72041363
0.71696668 0.70509383 0.70317886 0.71581769]
mean value: 0.7137005683293933
key: test_roc_auc
value: [0.53125746 0.55067731 0.53047896 0.55642235 0.53519594 0.53410523
0.5807487 0.5135881 0.60294559 0.57762913]
mean value: 0.5513048753726002
key: train_roc_auc
value: [0.57590152 0.5669824 0.57706832 0.57087493 0.57404603 0.56865432
0.56184479 0.57193298 0.56706309 0.56283541]
mean value: 0.5697203796539386
key: test_jcc
value: [0.14150943 0.16 0.15789474 0.17475728 0.16071429 0.1372549
0.19354839 0.13793103 0.24761905 0.20952381]
mean value: 0.17207529187552273
key: train_jcc
value: [0.20044053 0.18681319 0.21183801 0.19365427 0.20566632 0.18708241
0.1761427 0.20536636 0.19772257 0.17920354]
mean value: 0.19439298729374982
MCC on Blind test: 0.1
Accuracy on Blind test: 0.7
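Note: BernoulliNB binarizes its input at the binarize threshold (default 0.0), so after min-max scaling every value above a column's minimum is treated as 1, which can discard most of the scaled signal. A toy illustration; the 0.5 threshold at the end is an illustrative alternative, not what this run used:

# Toy illustration of BernoulliNB's default binarization of scaled inputs.
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import binarize

X_scaled = np.array([[0.0, 0.2], [0.5, 0.8], [1.0, 0.0]])
print(binarize(X_scaled, threshold=0.0))   # values > 0 become 1: [[0. 1.] [1. 1.] [1. 0.]]

clf = BernoulliNB(binarize=0.5)            # a higher threshold keeps more of the scaled contrast (illustrative)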
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.04448915 0.03829861 0.03663588 0.03993273 0.03386569 0.05243778
0.04492474 0.04603219 0.04355192 0.06486773]
mean value: 0.044503641128540036
key: score_time
value: [0.01288319 0.0131712 0.01299167 0.01299238 0.01378965 0.01307154
0.01304054 0.01305079 0.01306844 0.01237059]
mean value: 0.01304299831390381
key: test_mcc
value: [0.39022844 0.12380816 0.27938703 0.24588084 0.23962774 0.4757615
0.05421797 0.09202868 0.2919632 0.16708881]
mean value: 0.23599923513520443
key: train_mcc
value: [0.29790693 0.16870503 0.27446316 0.29696096 0.1495214 0.47912722
0.28883605 0.26615061 0.28731088 0.17898885]
mean value: 0.26879710888915814
key: test_fscore
value: [0.42592593 0.44 0.28865979 0.25263158 0.14285714 0.62111801
0.18518519 0.13043478 0.3030303 0.07317073]
mean value: 0.28630134564987314
key: train_fscore
value: [0.33763441 0.45448635 0.29246344 0.28703704 0.08064516 0.62708472
0.32279171 0.21728395 0.26682409 0.11096433]
mean value: 0.2997215198672663
key: test_precision
value: [0.79310345 0.28308824 0.73684211 0.70588235 1. 0.6097561
0.34482759 0.46153846 0.75 1. ]
mean value: 0.6685038287080648
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
(this UserWarning was raised repeatedly at this point in the run)
key: train_precision
value: [0.70403587 0.29518581 0.71823204 0.79487179 0.83333333 0.59343434
0.7047619 0.85436893 0.80714286 0.84 ]
mean value: 0.7145366895032236
key: test_recall
value: [0.29113924 0.98717949 0.17948718 0.15384615 0.07692308 0.63291139
0.12658228 0.07594937 0.18987342 0.03797468]
mean value: 0.27518662771827335
key: train_recall
value: [0.22206506 0.98728814 0.18361582 0.17514124 0.04237288 0.66478076
0.20933522 0.12446959 0.15983027 0.05940594]
mean value: 0.2828304924923485
key: test_accuracy
value: [0.78694158 0.32413793 0.76206897 0.75517241 0.75172414 0.78965517
0.69655172 0.72413793 0.76206897 0.73793103]
mean value: 0.7090389856618083
key: train_accuracy
value: [0.76398467 0.35733435 0.75909613 0.76407507 0.73803141 0.78590578
0.76216009 0.75718116 0.76216009 0.74224435]
mean value: 0.7192173107879866
key: test_roc_auc
value: [0.63141868 0.53368408 0.57795114 0.56513062 0.53846154 0.74062631
0.51826744 0.52138701 0.58308837 0.51898734]
mean value: 0.5729002529631338
key: train_roc_auc
value: [0.59369149 0.55512594 0.57840801 0.57916284 0.51960998 0.74783156
0.5883861 0.55829572 0.5728248 0.52760213]
mean value: 0.5820938574173901
key: test_jcc
value: [0.27058824 0.28205128 0.1686747 0.14457831 0.07692308 0.45045045
0.10204082 0.06976744 0.17857143 0.03797468]
mean value: 0.17816204270698482
key: train_jcc
value: [0.20310479 0.29406815 0.171278 0.16756757 0.04201681 0.45675413
0.19245774 0.12188366 0.15395095 0.05874126]
mean value: 0.18618230478094808
MCC on Blind test: 0.15
Accuracy on Blind test: 0.39
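Note: each classifier block ends with "MCC on Blind test" and "Accuracy on Blind test", i.e. one evaluation on a held-out blind set after the 10-fold CV. A self-contained sketch of that step with synthetic data; the split, classifier settings and variable names below are illustrative, not those of this run:

# Sketch of the final blind-test evaluation reported at the end of each block.
from sklearn.datasets import make_classification
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, weights=[0.73], random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(X, y, test_size=0.25,
                                                      stratify=y, random_state=42)

clf = PassiveAggressiveClassifier(n_jobs=10, random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_blind)
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))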
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.05962992 0.05943918 0.06080985 0.06234956 0.06290746 0.06261635
0.06332922 0.06211352 0.06102753 0.05955243]
mean value: 0.06137750148773193
key: score_time
value: [0.01472569 0.01415634 0.01428533 0.01442456 0.01433206 0.01440978
0.01443219 0.01438475 0.01441336 0.01440001]
mean value: 0.014396405220031739
key: test_mcc
value: [ 0.03995996 0.0953971 0.05468418 0.02060048 0.06370999 0.00847967
0.09512642 0.06484148 0.06484148 -0.0168643 ]
mean value: 0.04907764621219409
key: train_mcc
value: [0.10339309 0.1034458 0.10913888 0.10774049 0.10417245 0.10334557
0.10407151 0.10334557 0.10187991 0.09776187]
mean value: 0.10382951234670994
key: test_fscore
value: [0.42896936 0.43213296 0.42777778 0.42424242 0.42896936 0.4265928
0.43820225 0.43333333 0.43333333 0.41833811]
mean value: 0.4291891705196207
key: train_fscore
value: [0.43588163 0.43622921 0.43730698 0.43703704 0.43636364 0.4357473
0.43588163 0.4357473 0.4354789 0.43640898]
mean value: 0.4362082601681029
key: test_precision
value: [0.275 0.27561837 0.27304965 0.27017544 0.27402135 0.27304965
0.28158845 0.27758007 0.27758007 0.27037037]
mean value: 0.27480334166206594
key: train_precision
value: [0.2786756 0.27895981 0.2798419 0.27962085 0.27906977 0.2785658
0.2786756 0.2785658 0.27834646 0.27988804]
mean value: 0.27902096319974584
key: test_recall
value: [0.97468354 1. 0.98717949 0.98717949 0.98717949 0.97468354
0.98734177 0.98734177 0.98734177 0.92405063]
mean value: 0.9796981499513144
key: train_recall
value: [1. 1. 1. 1. 1. 1.
1. 1. 1. 0.99009901]
mean value: 0.9990099009900991
key: test_accuracy
value: [0.29553265 0.29310345 0.28965517 0.27931034 0.29310345 0.2862069
0.31034483 0.29655172 0.29655172 0.3 ]
mean value: 0.29403602322550065
key: train_accuracy
value: [0.29885057 0.29911911 0.30218307 0.30141708 0.29950211 0.29873612
0.29911911 0.29873612 0.29797013 0.307545 ]
mean value: 0.3003178418450675
key: test_roc_auc
value: [0.50856819 0.51650943 0.51009918 0.50302371 0.51245767 0.50155978
0.52210691 0.51262823 0.51262823 0.49520067]
mean value: 0.5094781995397277
key: train_roc_auc
value: [0.51918024 0.51918024 0.52128219 0.5207567 0.51944298 0.51917017
0.51943277 0.51917017 0.51864496 0.52209782]
mean value: 0.5198358245682732
key: test_jcc
value: [0.27304965 0.27561837 0.27208481 0.26923077 0.27304965 0.27112676
0.28057554 0.27659574 0.27659574 0.26449275]
mean value: 0.27324197833395414
key: train_jcc
value: [0.2786756 0.27895981 0.2798419 0.27962085 0.27906977 0.2785658
0.2786756 0.2785658 0.27834646 0.27910686]
mean value: 0.2789428445269598
MCC on Blind test: 0.09
Accuracy on Blind test: 0.35
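All of the "Running model pipeline" blocks in this log share the same preprocessing: MinMaxScaler over the 165 numeric columns and OneHotEncoder over the six categorical ones inside a ColumnTransformer with remainder='passthrough', followed by the model step. A minimal sketch of that construction, with abbreviated column lists because the full 165-column Index is truncated ('...') in the printout:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# placeholder column lists; the real run scales 165 numeric columns
numeric_cols = ['KOLA920101', 'MIYS930101', 'snap2_score', 'volumetric_rr']
categorical_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
                    'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), numeric_cols),
                  ('cat', OneHotEncoder(), categorical_cols)],
    remainder='passthrough')          # untouched columns are appended as-is

pipe = Pipeline(steps=[('prep', prep),
                       ('model', QuadraticDiscriminantAnalysis())])
# pipe.fit(X_train, y_train); pipe.predict(X_bts)   # X_*, y_* assumed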
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [8.36072302 8.34088826 8.06986618 8.1633141 8.12759304 8.26576948
8.05865216 8.08983588 8.12928081 8.15175128]
mean value: 8.175767421722412
key: score_time
value: [0.13614035 0.1359272 0.1348474 0.14098811 0.13177514 0.13770175
0.13953996 0.13831949 0.12977481 0.14186239]
mean value: 0.13668766021728515
key: test_mcc
value: [0.46699603 0.2738941 0.41431032 0.45516377 0.46038057 0.36443233
0.43837552 0.36391157 0.44214159 0.37355384]
mean value: 0.4053159630056153
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.55555556 0.41269841 0.52713178 0.55813953 0.5483871 0.46280992
0.54263566 0.4962406 0.52459016 0.48818898]
mean value: 0.5116377700943858
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.74468085 0.54166667 0.66666667 0.70588235 0.73913043 0.66666667
0.7 0.61111111 0.74418605 0.64583333]
mean value: 0.6765824129743687
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.44303797 0.33333333 0.43589744 0.46153846 0.43589744 0.35443038
0.44303797 0.41772152 0.40506329 0.39240506]
mean value: 0.4122362869198312
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.80756014 0.74482759 0.78965517 0.80344828 0.80689655 0.77586207
0.79655172 0.76896552 0.8 0.77586207]
mean value: 0.7869629102974286
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
value: [0.6932171 0.61477987 0.67785438 0.69539187 0.68964683 0.64403983
 0.68597396 0.65909773 0.67646529 0.65591817]
mean value: 0.6692385047225463
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.38461538 0.26 0.35789474 0.38709677 0.37777778 0.30107527
0.37234043 0.33 0.35555556 0.32291667]
mean value: 0.34492725900001575
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.3
Accuracy on Blind test: 0.75
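The FutureWarning in this block concerns max_features='auto', which scikit-learn 1.1 deprecates in favour of the equivalent 'sqrt' for classifiers (the Random Forest2 configuration below passes 'auto' explicitly). A hedged one-line adjustment that keeps the same behaviour without the warning, shown as a sketch rather than the script's actual setting:

from sklearn.ensemble import RandomForestClassifier

# 'sqrt' is what 'auto' meant for classifiers, so behaviour is unchanged
# while the deprecation warning disappears on scikit-learn >= 1.1
rf = RandomForestClassifier(n_estimators=1000, max_features='sqrt',
                            min_samples_leaf=5, n_jobs=10,
                            oob_score=True, random_state=42)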
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [1.87300825 1.86339927 1.86294794 1.90790439 1.82021618 1.91629529
1.93531013 1.8841157 1.8937006 1.87140822]
mean value: 1.882830595970154
key: score_time
value: [0.2996664 0.19163322 0.32926655 0.34980989 0.37643075 0.36670566
0.2930491 0.36704397 0.37959981 0.36961794]
mean value: 0.3322823286056519
key: test_mcc
value: [0.45290899 0.28538215 0.38151281 0.47029374 0.43199103 0.36171133
0.44370218 0.3600921 0.43988089 0.35813368]
mean value: 0.39856088841203274
key: train_mcc
value: [0.78860144 0.79400756 0.79699964 0.79068767 0.79018513 0.7799536
0.80162962 0.78529325 0.79379008 0.7904664 ]
mean value: 0.791161436744533
key: test_fscore
value: [0.52892562 0.40983607 0.47933884 0.55284553 0.49122807 0.45378151
0.53225806 0.48854962 0.50847458 0.46774194]
mean value: 0.491297983421125
key: train_fscore
value: [0.82391482 0.82871126 0.83157038 0.82640587 0.82459016 0.81456954
0.83374283 0.82160393 0.82843137 0.82612245]
mean value: 0.8259662614044293
key: test_precision
value: [0.76190476 0.56818182 0.6744186 0.75555556 0.77777778 0.675
0.73333333 0.61538462 0.76923077 0.64444444]
mean value: 0.6975231680464239
key: train_precision
value: [0.97859922 0.98069498 0.98080614 0.97687861 0.98242188 0.98203593
0.99027237 0.97475728 0.98065764 0.97683398]
mean value: 0.9803958032540228
key: test_recall
value: [0.40506329 0.32051282 0.37179487 0.43589744 0.35897436 0.34177215
0.41772152 0.40506329 0.37974684 0.36708861]
mean value: 0.38036351833820187
key: train_recall
value: [0.71145686 0.71751412 0.72175141 0.71610169 0.71045198 0.69589816
0.71994342 0.71004243 0.71711457 0.71570014]
mean value: 0.7135974796026818
key: test_accuracy
value: [0.80412371 0.75172414 0.78275862 0.81034483 0.8 0.77586207
0.8 0.76896552 0.8 0.77241379]
mean value: 0.7866192676857447
key: train_accuracy
value: [0.91762452 0.91957105 0.92072003 0.91842206 0.91803907 0.91420912
0.92225201 0.91650709 0.91957105 0.91842206]
mean value: 0.918533804079704
key: test_roc_auc
value: [0.67894674 0.61544509 0.65287857 0.69200532 0.66061925 0.64008039
0.68042474 0.65513828 0.6685464 0.64562961]
mean value: 0.6589714399345485
key: train_roc_auc
value: [0.85283826 0.85612963 0.85824828 0.85489793 0.8528613 0.84558564
0.85865869 0.85160735 0.85593123 0.85469881]
mean value: 0.8541457113014491
key: test_jcc
value: [0.35955056 0.25773196 0.31521739 0.38202247 0.3255814 0.29347826
0.36263736 0.32323232 0.34090909 0.30526316]
mean value: 0.32656239746670157
key: train_jcc
value: [0.7005571 0.70752089 0.71169916 0.70416667 0.70153417 0.68715084
0.71488764 0.69722222 0.70711297 0.70375522]
mean value: 0.7035606882543431
MCC on Blind test: 0.3
Accuracy on Blind test: 0.75
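Random Forest2 is fitted with oob_score=True, so the forest also carries an out-of-bag accuracy estimate after fitting. A minimal, self-contained sketch of reading it back; the random stand-in data below replaces the real combined gene feature matrix:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X = rng.random((300, 12))              # stand-in features
y = rng.integers(0, 2, 300)            # stand-in binary target

rf = RandomForestClassifier(n_estimators=1000, max_features='sqrt',
                            min_samples_leaf=5, oob_score=True,
                            n_jobs=10, random_state=42)
rf.fit(X, y)
print('OOB accuracy estimate:', rf.oob_score_)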
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05000663 0.05711246 0.04777908 0.04916215 0.0484252 0.04848003
0.03723121 0.04868436 0.03878665 0.0301578 ]
mean value: 0.04558255672454834
key: score_time
value: [0.02048278 0.02044511 0.02054 0.02033067 0.02053881 0.02331877
0.02030563 0.02813411 0.02370739 0.02920508]
mean value: 0.02270083427429199
key: test_mcc
value: [0.46645565 0.40851227 0.36533193 0.43473704 0.52424076 0.40907622
0.38380869 0.24452881 0.40619757 0.36838608]
mean value: 0.40112750272436204
key: train_mcc
value: [0.46781082 0.46351535 0.46543376 0.46822503 0.44790608 0.4677778
0.47424835 0.4767288 0.46954873 0.47524738]
mean value: 0.46764421041183796
key: test_fscore
value: [0.57777778 0.512 0.5037037 0.54263566 0.59677419 0.52307692
0.5112782 0.40601504 0.515625 0.49230769]
mean value: 0.518119418241192
key: train_fscore
value: [0.5743073 0.57022708 0.56629598 0.5743073 0.5541347 0.56947997
0.57627119 0.5786802 0.56971771 0.57912458]
mean value: 0.5712546009932389
key: test_precision
value: [0.69642857 0.68085106 0.59649123 0.68627451 0.80434783 0.66666667
0.62962963 0.5 0.67346939 0.62745098]
mean value: 0.6561609863662966
key: train_precision
value: [0.70661157 0.7047817 0.71800434 0.70807453 0.69892473 0.7167382
0.71881607 0.72 0.72077922 0.71517672]
mean value: 0.7127907079802824
key: test_recall
value: [0.49367089 0.41025641 0.43589744 0.44871795 0.47435897 0.43037975
0.43037975 0.34177215 0.41772152 0.40506329]
mean value: 0.4288218111002921
key: train_recall
value: [0.48373409 0.47881356 0.46751412 0.48305085 0.45903955 0.47241867
0.48090523 0.48373409 0.47100424 0.48656294]
mean value: 0.4766777343593923
key: test_accuracy
value: [0.80412371 0.78965517 0.76896552 0.79655172 0.82758621 0.7862069
0.77586207 0.72758621 0.7862069 0.77241379]
mean value: 0.7835158194098827
key: train_accuracy
value: [0.80574713 0.80428954 0.80582152 0.80582152 0.7996936 0.80658751
0.80850249 0.80926848 0.8073535 0.80850249]
mean value: 0.8061587800508019
key: test_roc_auc
value: [0.7067411 0.66975085 0.66370343 0.68662313 0.71595307 0.67490551
0.66779651 0.60690503 0.67094607 0.65750795]
mean value: 0.6720832653820337
key: train_roc_auc
value: [0.70455753 0.70209727 0.69960047 0.70447866 0.69273575 0.70154547
0.70552615 0.70694057 0.70162607 0.70730458]
mean value: 0.7026412512967692
key: test_jcc
value: [0.40625 0.34408602 0.33663366 0.37234043 0.42528736 0.35416667
0.34343434 0.25471698 0.34736842 0.32653061]
mean value: 0.35108144912560824
key: train_jcc
value: [0.40282686 0.39882353 0.39498807 0.40282686 0.38325472 0.39809297
0.4047619 0.40714286 0.39832536 0.40758294]
mean value: 0.39986260504299165
MCC on Blind test: 0.22
Accuracy on Blind test: 0.73
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.1975565 0.18135667 0.18015242 0.20119381 0.17986298 0.18718934
0.34333491 0.24896049 0.198699 0.21950459]
mean value: 0.21378107070922853
key: score_time
value: [0.02017689 0.02014303 0.02007008 0.02163529 0.02003121 0.02201009
0.02396131 0.02285862 0.02004385 0.02002907]
mean value: 0.021095943450927735
key: test_mcc
value: [0.46699603 0.35963016 0.36533193 0.4436844 0.46810101 0.3795258
0.40619757 0.25176322 0.40619757 0.36838608]
mean value: 0.3915813779701217
key: train_mcc
value: [0.44578755 0.45820678 0.46543376 0.4470376 0.43116922 0.45616557
0.44484411 0.47230874 0.44981092 0.47524738]
mean value: 0.45460116471480727
key: test_fscore
value: [0.55555556 0.46280992 0.5037037 0.546875 0.53781513 0.48387097
0.515625 0.40909091 0.515625 0.49230769]
mean value: 0.5023278871805588
key: train_fscore
value: [0.546875 0.55709343 0.56629598 0.5492228 0.5323993 0.55166375
0.54370629 0.56721596 0.54529464 0.57912458]
mean value: 0.5538891716492834
key: test_precision
value: [0.74468085 0.65116279 0.59649123 0.7 0.7804878 0.66666667
0.67346939 0.50943396 0.67346939 0.62745098]
mean value: 0.6623313059542907
key: train_precision
value: [0.70786517 0.71875 0.71800434 0.70666667 0.70046083 0.72413793
0.71167048 0.73318386 0.72093023 0.71517672]
mean value: 0.7156846218914653
key: test_recall
value: [0.44303797 0.35897436 0.43589744 0.44871795 0.41025641 0.37974684
0.41772152 0.34177215 0.41772152 0.40506329]
mean value: 0.4058909444985394
key: train_recall
value: [0.44554455 0.45480226 0.46751412 0.44915254 0.42937853 0.44554455
0.43988685 0.46251768 0.43847242 0.48656294]
mean value: 0.45193764533838376
key: test_accuracy
value: [0.80756014 0.77586207 0.76896552 0.8 0.81034483 0.77931034
0.7862069 0.73103448 0.7862069 0.77241379]
mean value: 0.7817904965043252
key: train_accuracy
value: [0.8 0.80390655 0.80582152 0.8000766 0.79548066 0.80390655
0.8000766 0.80888548 0.80199157 0.80850249]
mean value: 0.8028648027575642
key: test_roc_auc
value: [0.6932171 0.64410982 0.66370343 0.68898162 0.68390179 0.65432839
0.67094607 0.6092747 0.67094607 0.65750795]
mean value: 0.6636916941932919
key: train_roc_auc
value: [0.68861568 0.69429551 0.69960047 0.68989419 0.68053267 0.69125967
0.68685519 0.70000884 0.6877236 0.70730458]
mean value: 0.6926090402380901
key: test_jcc
value: [0.38461538 0.30107527 0.33663366 0.37634409 0.36781609 0.31914894
0.34736842 0.25714286 0.34736842 0.32653061]
mean value: 0.3364043742437685
key: train_jcc
value: [0.37634409 0.38609113 0.39498807 0.37857143 0.3627685 0.3808948
0.37334934 0.39588378 0.37484885 0.40758294]
mean value: 0.3831322912054633
MCC on Blind test: 0.21
Accuracy on Blind test: 0.73
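RidgeClassifierCV(cv=3) selects its regularisation strength by internal 3-fold cross-validation; with no alphas given it searches the scikit-learn default grid (0.1, 1.0, 10.0). A hedged sketch spelling that out; the explicit grid below is the library default, not a tuned choice from this run:

import numpy as np
from sklearn.linear_model import RidgeClassifierCV

ridge_cv = RidgeClassifierCV(alphas=np.array([0.1, 1.0, 10.0]), cv=3)
# after ridge_cv.fit(X_train, y_train), ridge_cv.alpha_ holds the chosen strength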
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.28892684 0.34198499 0.345222 0.27042603 0.27116203 0.2709589
0.34672308 0.32980013 0.27179193 0.26866913]
mean value: 0.30056650638580323
key: score_time
value: [0.0818913 0.08750343 0.09059858 0.08254004 0.08279061 0.08304667
0.0889411 0.09171605 0.08015299 0.07943845]
mean value: 0.08486192226409912
key: test_mcc
value: [0.36533828 0.26060918 0.30197942 0.34384959 0.38424069 0.23043042
0.40385865 0.1924525 0.33821687 0.32015637]
mean value: 0.3141131973399914
key: train_mcc
value: [0.43042951 0.44329605 0.42997701 0.43021536 0.40282325 0.4310017
0.43167293 0.4517468 0.43995525 0.43995525]
mean value: 0.4331073090066477
key: test_fscore
value: [0.37254902 0.32380952 0.35849057 0.40366972 0.38 0.30188679
0.46956522 0.31304348 0.38888889 0.40350877]
mean value: 0.37154119831494625
key: train_fscore
value: [0.48228346 0.4985451 0.47562189 0.46928499 0.43433299 0.46544715
0.45884774 0.5 0.48358209 0.48358209]
mean value: 0.4751527510747966
key: test_precision
value: [0.82608696 0.62962963 0.67857143 0.70967742 0.86363636 0.59259259
0.75 0.5 0.72413793 0.65714286]
mean value: 0.6931475178483932
key: train_precision
value: [0.79288026 0.79566563 0.8047138 0.81754386 0.81081081 0.8267148
0.84150943 0.81469649 0.81543624 0.81543624]
mean value: 0.8135407572999125
key: test_recall
value: [0.24050633 0.21794872 0.24358974 0.28205128 0.24358974 0.20253165
0.34177215 0.2278481 0.26582278 0.29113924]
mean value: 0.25567997403440434
key: train_recall
value: [0.34653465 0.36299435 0.33757062 0.32909605 0.29661017 0.32390382
0.31541726 0.36067893 0.3437058 0.3437058 ]
mean value: 0.3360217438208712
key: test_accuracy
value: [0.78006873 0.75517241 0.76551724 0.77586207 0.7862069 0.74482759
0.78965517 0.72758621 0.77241379 0.76551724]
mean value: 0.7662827349211991
key: train_accuracy
value: [0.79846743 0.80199157 0.79816162 0.79816162 0.79050172 0.79854462
0.79854462 0.80467254 0.80122558 0.80122558]
mean value: 0.7991496923566814
key: test_roc_auc
value: [0.6108192 0.58538945 0.60056846 0.61979923 0.6147194 0.57519947
0.64955906 0.57127002 0.61395405 0.6171336 ]
mean value: 0.6058411942066863
key: train_roc_auc
value: [0.65645177 0.66415613 0.65354621 0.65088538 0.63543068 0.64934687
0.64667922 0.66510837 0.65740962 0.65740962]
mean value: 0.6536423880481571
key: test_jcc
value: [0.22891566 0.19318182 0.2183908 0.25287356 0.2345679 0.17777778
0.30681818 0.18556701 0.24137931 0.25274725]
mean value: 0.2292219282880399
key: train_jcc
value: [0.31776913 0.33204134 0.31201044 0.30657895 0.27741083 0.30331126
0.29773031 0.33333333 0.31889764 0.31889764]
mean value: 0.31179808724112323
MCC on Blind test: 0.23
Accuracy on Blind test: 0.73
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.10167432 0.08535409 0.08589745 0.09299016 0.09449673 0.11019874
0.12004852 0.08432817 0.08467627 0.09961772]
mean value: 0.09592821598052978
key: score_time
value: [0.01093888 0.0112102 0.01123762 0.01108336 0.01130486 0.01933503
0.01113534 0.01121855 0.01124907 0.01119256]
mean value: 0.011990547180175781
key: test_mcc
value: [0.41652322 0.41133434 0.45720022 0.35963016 0.39621323 0.32835729
0.32796739 0.25438925 0.12686505 0.39275981]
mean value: 0.347123994781404
key: train_mcc
value: [0.49349664 0.46966611 0.46263839 0.4380867 0.47174711 0.40587973
0.43359224 0.44509497 0.15070895 0.48626212]
mean value: 0.42571729419345583
key: test_fscore
value: [0.59259259 0.51968504 0.58741259 0.46280992 0.5625 0.3963964
0.448 0.38016529 0.07228916 0.57458564]
mean value: 0.45964366143688473
key: train_fscore
value: [0.64332344 0.57383966 0.59355828 0.5311943 0.62242867 0.45381526
0.51872146 0.52994555 0.06811989 0.63859649]
mean value: 0.5173543006923467
key: test_precision
value: [0.50909091 0.67346939 0.64615385 0.65116279 0.54878049 0.6875
0.60869565 0.54761905 0.75 0.50980392]
mean value: 0.6132276042863998
key: train_precision
value: [0.55419223 0.71278826 0.64932886 0.71980676 0.58698373 0.78200692
0.73195876 0.73924051 0.92592593 0.5443669 ]
mean value: 0.6946598855863386
key: test_recall
value: [0.70886076 0.42307692 0.53846154 0.35897436 0.57692308 0.27848101
0.35443038 0.29113924 0.03797468 0.65822785]
mean value: 0.42265498214865305
key: train_recall
value: [0.76661952 0.48022599 0.54661017 0.42090395 0.66242938 0.31966054
0.40169731 0.41301273 0.03536068 0.77227723]
mean value: 0.4818797497183132
key: test_accuracy
value: [0.73539519 0.78965517 0.79655172 0.77586207 0.75862069 0.76896552
0.76206897 0.74137931 0.73448276 0.73448276]
mean value: 0.7597464154520678
key: train_accuracy
value: [0.7697318 0.80658751 0.79701264 0.79854462 0.78207583 0.79165071
0.79816162 0.80160858 0.73803141 0.76330908]
mean value: 0.7846713800000293
key: test_roc_auc
value: [0.72707189 0.67380261 0.71498549 0.64410982 0.70119739 0.61554382
0.63456116 0.60054592 0.51661767 0.71063051]
mean value: 0.6539066290516133
key: train_roc_auc
value: [0.7687538 0.7041172 0.71839179 0.67997379 0.74450949 0.64328615
0.67353773 0.67945805 0.51715513 0.76612811]
mean value: 0.6895311223145233
key: test_jcc
value: [0.42105263 0.35106383 0.41584158 0.30107527 0.39130435 0.24719101
0.28865979 0.23469388 0.0375 0.40310078]
mean value: 0.30914831199630954
key: train_jcc
value: [0.47419073 0.40236686 0.42202835 0.36165049 0.45183044 0.29350649
0.35018496 0.36049383 0.03526093 0.46907216]
mean value: 0.36205852453348547
MCC on Blind test: 0.33
Accuracy on Blind test: 0.68
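The SettingWithCopyWarning messages above (MultClfs_logo_skf.py lines 419 and 446) are triggered when a 'source_data' column is assigned to a frame that is itself a slice of a larger scores frame. A minimal sketch of the usual remedy, taking an explicit copy before the assignment; the frame below is a stand-in, not the script's object:

import pandas as pd

scores = pd.DataFrame({'MCC': [0.35, 0.45], 'Accuracy': [0.76, 0.80]})

# slicing returns a view-like object; copy it before adding columns so the
# assignment is unambiguous and pandas stays silent
scoresDF_CV = scores[scores['MCC'] > 0].copy()
scoresDF_CV['source_data'] = 'CV'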
Running classifier: 24
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.37790298 0.36533308 0.51010323 0.35950994 0.36229634 0.3688941
0.35796928 0.50260568 0.37871933 0.35440183]
mean value: 0.39377357959747317
key: score_time
value: [0.01200342 0.01274371 0.01206803 0.01224232 0.0122242 0.01256561
0.01236796 0.01211452 0.01169276 0.01193953]
mean value: 0.012196207046508789
key: test_mcc
value: [0.46645565 0.35024614 0.40447435 0.51621419 0.56273856 0.45880161
0.37680633 0.46935636 0.5323448 0.40277165]
mean value: 0.45402096353283505
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57777778 0.49635036 0.54285714 0.63448276 0.65693431 0.58741259
0.53947368 0.60402685 0.62121212 0.54545455]
mean value: 0.5805982134715821
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.69642857 0.57627119 0.61290323 0.68656716 0.76271186 0.65625
0.56164384 0.64285714 0.77358491 0.609375 ]
mean value: 0.6578592896395544
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.49367089 0.43589744 0.48717949 0.58974359 0.57692308 0.53164557
0.51898734 0.56962025 0.51898734 0.49367089]
mean value: 0.5216325868224603
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.80412371 0.76206897 0.77931034 0.81724138 0.83793103 0.79655172
0.75862069 0.79655172 0.82758621 0.77586207]
mean value: 0.7955847849271239
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.7067411 0.65898645 0.68698597 0.74534349 0.75544267 0.71369008
0.68366429 0.72556842 0.73105765 0.68759374]
mean value: 0.7095073870354812
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.40625 0.33009709 0.37254902 0.46464646 0.48913043 0.41584158
0.36936937 0.43269231 0.45054945 0.375 ]
mean value: 0.41061257181851013
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.29
Accuracy on Blind test: 0.74
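Every classifier block reports the same per-fold scores: MCC, F1 (fscore), precision, recall, accuracy, ROC AUC and the Jaccard coefficient (jcc), for both train and test folds. A minimal sketch of collecting that score set with cross_validate; the 10-fold StratifiedKFold settings and the pipe/X_train/y_train names are assumptions here:

from sklearn.model_selection import StratifiedKFold, cross_validate

scoring = {'mcc': 'matthews_corrcoef', 'fscore': 'f1', 'precision': 'precision',
           'recall': 'recall', 'accuracy': 'accuracy',
           'roc_auc': 'roc_auc', 'jcc': 'jaccard'}

skf = StratifiedKFold(n_splits=10)     # exact shuffle/random_state not shown in the log
# scores = cross_validate(pipe, X_train, y_train, cv=skf,
#                         scoring=scoring, return_train_score=True)
# scores['test_mcc'].mean(), scores['train_mcc'].mean(), ...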
Extracting tts_split_name: logo_skf_BT_rpob
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_rpob
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
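The combine step above stacks the CV and blind-test score frames by row: two 24 x 8 frames with the same eight metric columns become one 48 x 8 frame, which is then merged with the metadata. A minimal sketch of the rowbind, using stand-in single-row frames with the eight shared columns listed in the log:

import pandas as pd

cols = ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
# stand-in rows; the real frames hold one row per classifier (24 each)
scoresDF_CV = pd.DataFrame([[0.66, 0.79, 'CV', 0.51, 0.41, 0.40, 0.67, 0.34]], columns=cols)
scoresDF_BT = pd.DataFrame([[0.60, 0.74, 'BT', 0.45, 0.38, 0.29, 0.62, 0.28]], columns=cols)

combined_df_wf = pd.concat([scoresDF_CV, scoresDF_BT], axis=0, ignore_index=True)
print(combined_df_wf.shape)            # (48, 8) in the real run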
BTS gene: pnca
Total genes: 6
Training on: 5
Training on genes: ['alr', 'katg', 'gid', 'rpob', 'embb']
Omitted genes: ['pnca']
Blind test gene: pnca
/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_pnca.csv
Training data dim: (3609, 171)
Training Target dim: (3609,)
Checked training df does NOT have Target var
TEST data dim: (424, 171)
TEST Target dim: (424,)
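The split above is leave-one-gene-out: pnca is held back as the blind-test (BT) set and the models are trained on the other five genes. A minimal sketch of such a split, assuming a gene identifier column (gene_name) and a binary target column (dst_mode); both names are assumptions here rather than quotes of the script:

import pandas as pd

def logo_split(df, bts_gene, gene_col='gene_name', target_col='dst_mode'):
    """Train on every gene except `bts_gene`; blind-test on `bts_gene` only."""
    train = df[df[gene_col] != bts_gene]
    test  = df[df[gene_col] == bts_gene]
    X_train, y_train = train.drop(columns=[target_col]), train[target_col]
    X_bts,   y_bts   = test.drop(columns=[target_col]),  test[target_col]
    return X_train, y_train, X_bts, y_bts

# X_train, y_train, X_bts, y_bts = logo_split(combined_df, 'pnca')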
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
================================================================
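The numbered "Running classifier: N" blocks that follow iterate over the (name, estimator) pairs listed above, dropping each estimator into the same 'prep' + 'model' pipeline. A hedged sketch of that loop with an abbreviated model list and a stand-in 'passthrough' prep step; the real prep is the ColumnTransformer printed with every model:

from sklearn.pipeline import Pipeline
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

models = [('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
          ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
          ('Decision Tree', DecisionTreeClassifier(random_state=42))]  # ...24 in the real run

prep = 'passthrough'                   # stand-in for the ColumnTransformer
for i, (name, clf) in enumerate(models, start=1):
    print(f'Running classifier: {i}')
    print(f'Model_name: {name}')
    pipe = Pipeline(steps=[('prep', prep), ('model', clf)])
    # pipe.fit(X_train, y_train); then CV scores and the blind-test scores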
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.77215004 0.77167892 0.83274436 0.83441544 0.80741739 0.76621962
0.76329207 0.76404357 0.76457691 0.77009058]
mean value: 0.7846628904342652
key: score_time
value: [0.02036786 0.01913285 0.02113223 0.01935077 0.01921248 0.01863432
0.01867318 0.01878357 0.01871634 0.01871443]
mean value: 0.01927180290222168
key: test_mcc
value: [0.32675505 0.45345977 0.40620491 0.4218022 0.31345373 0.22998165
0.28171645 0.30022833 0.39519185 0.31020751]
mean value: 0.3439001461731883
key: train_mcc
value: [0.42797541 0.42000177 0.45280565 0.44649351 0.45885102 0.41942903
0.44781615 0.45170868 0.42809537 0.44335831]
mean value: 0.4396534903096055
key: test_fscore
value: [0.45205479 0.52941176 0.52903226 0.51428571 0.41481481 0.35971223
0.41891892 0.42758621 0.46969697 0.43356643]
mean value: 0.4549080105686176
key: train_fscore
value: [0.52801228 0.51073986 0.54919908 0.5350118 0.55351682 0.52220521
0.54573171 0.54878049 0.51787133 0.53685027]
mean value: 0.5347918841966184
key: test_precision
value: [0.55 0.72 0.5942029 0.66666667 0.58333333 0.48076923
0.50819672 0.53448276 0.68888889 0.54385965]
mean value: 0.5870400147263817
key: train_precision
value: [0.65648855 0.67154812 0.67669173 0.69105691 0.68301887 0.64583333
0.67041199 0.6741573 0.67775468 0.67843137]
mean value: 0.6725392846616618
key: test_recall
value: [0.38372093 0.41860465 0.47674419 0.41860465 0.32183908 0.28735632
0.35632184 0.35632184 0.35632184 0.36046512]
mean value: 0.3736300454423951
key: train_recall
value: [0.44159178 0.41206675 0.46213094 0.436457 0.46529563 0.43830334
0.46015424 0.46272494 0.41902314 0.44415918]
mean value: 0.4441906933614053
key: test_accuracy
value: [0.77839335 0.82271468 0.79778393 0.81163435 0.78116343 0.7534626
0.76177285 0.7700831 0.80609418 0.775 ]
mean value: 0.7858102493074792
key: train_accuracy
value: [0.81065271 0.81065271 0.81804187 0.81804187 0.82019704 0.80788177
0.81650246 0.81773399 0.81311576 0.81625115]
mean value: 0.8149071351245627
key: test_roc_auc
value: [0.64276956 0.68384778 0.687463 0.67657505 0.62442319 0.59440809
0.62341639 0.62889085 0.65261347 0.6327873 ]
mean value: 0.6447194686402419
key: train_roc_auc
value: [0.68434389 0.67423913 0.69623355 0.68744681 0.69863972 0.68129742
0.69444959 0.6961398 0.67813505 0.68888121]
mean value: 0.6879806155121887
key: test_jcc
value: [0.2920354 0.36 0.35964912 0.34615385 0.26168224 0.21929825
0.26495726 0.27192982 0.30693069 0.27678571]
mean value: 0.2959422352669331
key: train_jcc
value: [0.35870699 0.34294872 0.3785489 0.36519871 0.38266385 0.35336788
0.37526205 0.37815126 0.3494105 0.3669141 ]
mean value: 0.36511729574696794
MCC on Blind test: 0.17
Accuracy on Blind test: 0.59
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.3842423 0.35396051 0.33973908 0.45001793 0.43656397 0.41783166
0.32485485 0.44265795 0.42740798 0.4227972 ]
mean value: 0.4000073432922363
key: score_time
value: [0.02787042 0.03601909 0.04131699 0.04241252 0.04028273 0.03782964
0.02386832 0.05039573 0.04291034 0.0428853 ]
mean value: 0.03857910633087158
key: test_mcc
value: [0.34194329 0.30844062 0.33782411 0.36710731 0.37864481 0.25398285
0.2929364 0.39544386 0.37407011 0.3594838 ]
mean value: 0.34098771595592237
key: train_mcc
value: [0.95515036 0.94570979 0.95769371 0.95763727 0.9525293 0.93618269
0.9490388 0.94909662 0.95422839 0.95081794]
mean value: 0.9508084867582983
key: test_fscore
value: [0.45390071 0.4 0.46258503 0.47142857 0.46268657 0.38297872
0.4295302 0.5 0.46715328 0.44274809]
mean value: 0.4473011182847338
key: train_fscore
value: [0.96483079 0.95739015 0.96688742 0.9669749 0.96276596 0.94983278
0.96015936 0.96005326 0.96414343 0.96153846]
mean value: 0.9614576500329461
key: test_precision
value: [0.58181818 0.59090909 0.55737705 0.61111111 0.65957447 0.5
0.51612903 0.63157895 0.64 0.64444444]
mean value: 0.5932942325174748
key: train_precision
value: [0.99862637 0.9944675 0.99863201 0.99591837 0.99724518 0.9902371
0.99313187 0.99585635 0.99725275 0.99451303]
mean value: 0.9955880527072326
key: test_recall
value: [0.37209302 0.30232558 0.39534884 0.38372093 0.35632184 0.31034483
0.36781609 0.4137931 0.36781609 0.3372093 ]
mean value: 0.3606789628441593
key: train_recall
value: [0.93324775 0.92297818 0.93709884 0.93966624 0.93059126 0.9125964
0.92930591 0.92673522 0.93316195 0.93068036]
mean value: 0.9296062119057126
key: test_accuracy
value: [0.7867036 0.78393352 0.78116343 0.79501385 0.80055402 0.75900277
0.76454294 0.80055402 0.79778393 0.79722222]
mean value: 0.7866474299784549
key: train_accuracy
value: [0.98368227 0.98029557 0.98460591 0.98460591 0.98275862 0.97690887
0.98152709 0.98152709 0.98337438 0.98214835]
mean value: 0.9821434067625205
key: test_roc_auc
value: [0.64422833 0.61843552 0.64858351 0.65367865 0.64896384 0.60590234
0.62916352 0.66857538 0.65106133 0.63940757]
mean value: 0.6407999990066848
key: train_roc_auc
value: [0.96642137 0.96067904 0.96834691 0.96922559 0.96489077 0.9548812
0.96364081 0.96276032 0.96617612 0.96453046]
mean value: 0.9641552588840205
key: test_jcc
value: [0.29357798 0.25 0.30088496 0.30841121 0.30097087 0.23684211
0.27350427 0.33333333 0.3047619 0.28431373]
mean value: 0.2886600368496133
key: train_jcc
value: [0.93205128 0.91826309 0.93589744 0.93606138 0.92820513 0.9044586
0.92337165 0.92317542 0.93076923 0.92592593]
mean value: 0.9258179136968911
MCC on Blind test: 0.17
Accuracy on Blind test: 0.58
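The UserWarning/RuntimeWarning above indicate that, with the default number of bagging estimators, some training samples were never out-of-bag, so their OOB votes are 0/0. Following the warning's own advice, a hedged sketch is simply to use more estimators; 100 here is an illustrative value, not the script's setting.

# Sketch only: with more base estimators it becomes very likely that every
# sample is out-of-bag for at least one of them, so oob_decision_function_
# no longer contains undefined (NaN) rows.
from sklearn.ensemble import BaggingClassifier

model = BaggingClassifier(n_estimators=100,   # default is 10
                          oob_score=True, n_jobs=10, random_state=42)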
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.21770072 0.25289273 0.2195797 0.22657871 0.24931931 0.23261094
0.21439838 0.24174213 0.21881509 0.23673463]
mean value: 0.23103723526000977
key: score_time
value: [0.01126862 0.01037169 0.01011014 0.01024437 0.01025963 0.01040983
0.01013398 0.01013708 0.0101459 0.01028323]
mean value: 0.01033644676208496
key: test_mcc
value: [0.24171066 0.25046703 0.18452135 0.31310782 0.25461513 0.20182099
0.26187794 0.25916104 0.20182099 0.18604786]
mean value: 0.23551508063109194
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.43093923 0.43575419 0.38418079 0.47674419 0.43820225 0.39306358
0.44808743 0.44067797 0.39306358 0.38888889]
mean value: 0.42296020949760765
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.41052632 0.41935484 0.37362637 0.47674419 0.42857143 0.39534884
0.42708333 0.43333333 0.39534884 0.37234043]
mean value: 0.4132277909360651
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.45348837 0.45348837 0.39534884 0.47674419 0.44827586 0.3908046
0.47126437 0.44827586 0.3908046 0.40697674]
mean value: 0.4335471798984229
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.71468144 0.72022161 0.69806094 0.75069252 0.72299169 0.70914127
0.72022161 0.72576177 0.70914127 0.69444444]
mean value: 0.7165358571868268
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.624926 0.62856237 0.59403805 0.65655391 0.62924742 0.60051179
0.63526722 0.63107224 0.60051179 0.59582414]
mean value: 0.6196514930679904
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.27464789 0.27857143 0.23776224 0.3129771 0.28057554 0.24460432
0.28873239 0.2826087 0.24460432 0.24137931]
mean value: 0.268646322591932
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.11
Accuracy on Blind test: 0.55
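The fit_time / score_time / test_* / train_* arrays reported for each classifier have the shape of scikit-learn's cross_validate output with return_train_score=True and a dictionary of scorers. The sketch below reproduces that key layout on synthetic data; the scorer names and the 10-fold stratified splitter are assumptions inferred from the key names and the ten values per metric, not taken from the script.

# Sketch of how per-fold arrays such as test_mcc / train_fscore are typically
# produced; the data, estimator and exact scorer set here are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import make_scorer, matthews_corrcoef, jaccard_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=42)
scoring = {'mcc': make_scorer(matthews_corrcoef), 'fscore': 'f1',
           'precision': 'precision', 'recall': 'recall',
           'accuracy': 'accuracy', 'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(DecisionTreeClassifier(random_state=42), X, y,
                        cv=skf, scoring=scoring, return_train_score=True)
print(sorted(scores))             # fit_time, score_time, test_*, train_* keys
print(scores['test_mcc'].mean())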
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.0235827 0.02561474 0.02579784 0.02664256 0.02643132 0.02380562
0.02473378 0.02528048 0.02598286 0.02546954]
mean value: 0.02533414363861084
key: score_time
value: [0.01046085 0.01110983 0.01143718 0.01104474 0.01122117 0.01019502
0.01122212 0.01113129 0.01133728 0.01112151]
mean value: 0.011028099060058593
key: test_mcc
value: [0.08018931 0.26852091 0.05361522 0.16476466 0.18661703 0.10751558
0.13263232 0.09677359 0.212486 0.24677356]
mean value: 0.1549888166504213
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.30057803 0.44571429 0.27906977 0.3625731 0.38150289 0.33513514
0.34285714 0.32608696 0.42574257 0.42774566]
mean value: 0.36270055509381693
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.29885057 0.43820225 0.27906977 0.36470588 0.38372093 0.31632653
0.34090909 0.30927835 0.37391304 0.42528736]
mean value: 0.35302637737679143
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.30232558 0.45348837 0.27906977 0.36046512 0.37931034 0.35632184
0.34482759 0.34482759 0.49425287 0.43023256]
mean value: 0.3745121625233895
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.66481994 0.73130194 0.6565097 0.69806094 0.70360111 0.65927978
0.68144044 0.6565097 0.67867036 0.725 ]
mean value: 0.6855193905817174
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.5402537 0.6358351 0.52680761 0.58205074 0.59293984 0.55589815
0.56657438 0.55015102 0.61573958 0.6238754 ]
mean value: 0.5790125510692594
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.17687075 0.28676471 0.16216216 0.22142857 0.23571429 0.2012987
0.20689655 0.19480519 0.27044025 0.27205882]
mean value: 0.22284399964164647
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.01
Accuracy on Blind test: 0.48
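The "MCC on Blind test" / "Accuracy on Blind test" lines appear to report the fitted model scored once on a separate held-out blind set. A minimal sketch of that step on synthetic data; the split, estimator and variable names are assumptions, not the script's.

# Sketch only: fit on training data, then score once on a held-out blind set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import ExtraTreeClassifier
from sklearn.metrics import matthews_corrcoef, accuracy_score

X, y = make_classification(n_samples=400, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(X, y, stratify=y, random_state=42)

clf = ExtraTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_blind)
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))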
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.60310817 0.59252024 0.59765458 0.59123945 0.61338592 0.62794566
0.6231389 0.62079525 0.60074377 0.61075974]
mean value: 0.6081291675567627
key: score_time
value: [0.02756548 0.02866197 0.02874327 0.02862692 0.02833867 0.02863312
0.02805734 0.028826 0.02721953 0.02619004]
mean value: 0.028086233139038085
key: test_mcc
value: [0.31217723 0.32404768 0.26622424 0.35178937 0.22579731 0.27567656
0.19953838 0.30165761 0.24205359 0.25276415]
mean value: 0.27517261187291425
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.39370079 0.35087719 0.352 0.38983051 0.3 0.36220472
0.31007752 0.40298507 0.33333333 0.33870968]
mean value: 0.35337188180274554
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.6097561 0.71428571 0.56410256 0.71875 0.54545455 0.575
0.47619048 0.57446809 0.53846154 0.55263158]
mean value: 0.5869100600109565
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.29069767 0.23255814 0.25581395 0.26744186 0.20689655 0.26436782
0.22988506 0.31034483 0.24137931 0.24418605]
mean value: 0.25435712376369957
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7867036 0.79501385 0.77562327 0.80055402 0.76731302 0.77562327
0.7534626 0.77839335 0.76731302 0.77222222]
mean value: 0.7772222222222223
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.61625793 0.60173362 0.59699789 0.61735729 0.57607601 0.60116201
0.57479654 0.61867606 0.58784294 0.59107113]
mean value: 0.5981971418420355
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.24509804 0.21276596 0.21359223 0.24210526 0.17647059 0.22115385
0.18348624 0.25233645 0.2 0.2038835 ]
mean value: 0.21508921094951106
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.26
Accuracy on Blind test: 0.57
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.71759415 3.70712686 3.69241643 3.70670629 3.7084744 3.70128179
3.69505262 3.68422008 3.69860005 3.70045066]
mean value: 3.701192331314087
key: score_time
value: [0.01056957 0.01049757 0.01053238 0.01087141 0.01057458 0.0104773
0.01067162 0.01062346 0.01045609 0.01050758]
mean value: 0.010578155517578125
key: test_mcc
value: [0.38266256 0.45563457 0.38741008 0.51134209 0.32853958 0.31316233
0.34649315 0.42565945 0.41876173 0.36160144]
mean value: 0.39312669785575316
key: train_mcc
value: [0.60432748 0.59294601 0.60829163 0.61437764 0.60371303 0.58864182
0.61598115 0.60898714 0.59341859 0.60523269]
mean value: 0.6035917183051488
key: test_fscore
value: [0.46616541 0.50393701 0.50666667 0.57971014 0.42105263 0.43356643
0.46575342 0.4962406 0.464 0.42857143]
mean value: 0.47656637528801565
key: train_fscore
value: [0.64772727 0.64485235 0.65489567 0.66242038 0.65168539 0.64449723
0.66666667 0.6624705 0.64251208 0.65498008]
mean value: 0.6532707616768082
key: test_precision
value: [0.65957447 0.7804878 0.59375 0.76923077 0.60869565 0.55357143
0.57627119 0.7173913 0.76315789 0.675 ]
mean value: 0.6697130508464613
key: train_precision
value: [0.8807947 0.85232068 0.87366167 0.8721174 0.86752137 0.83917526
0.8647541 0.85395538 0.85991379 0.86344538]
mean value: 0.8627659717869314
key: test_recall
value: [0.36046512 0.37209302 0.44186047 0.46511628 0.32183908 0.35632184
0.3908046 0.37931034 0.33333333 0.31395349]
mean value: 0.37350975674953224
key: train_recall
value: [0.51219512 0.51861361 0.5237484 0.53401797 0.5218509 0.52313625
0.54241645 0.54113111 0.51285347 0.52759949]
mean value: 0.5257562757605657
key: test_accuracy
value: [0.8033241 0.82548476 0.79501385 0.83933518 0.7867036 0.77562327
0.78393352 0.81440443 0.81440443 0.8 ]
mean value: 0.8038227146814405
key: train_accuracy
value: [0.86637931 0.86299261 0.86761084 0.86945813 0.86637931 0.86176108
0.87007389 0.86791872 0.86330049 0.86672822]
mean value: 0.8662602608305396
key: test_roc_auc
value: [0.65114165 0.66968288 0.67365751 0.71073996 0.62807282 0.63254048
0.64978186 0.66593254 0.65024331 0.63325412]
mean value: 0.6565047124822645
key: train_roc_auc
value: [0.74516196 0.74513102 0.74992604 0.75465581 0.74837484 0.74577865
0.7578479 0.75599065 0.74326884 0.75064185]
mean value: 0.7496777567985946
key: test_jcc
value: [0.30392157 0.33684211 0.33928571 0.40816327 0.26666667 0.27678571
0.30357143 0.33 0.30208333 0.27272727]
mean value: 0.31400470690668614
key: train_jcc
value: [0.4789916 0.47585395 0.48687351 0.4952381 0.48333333 0.47546729
0.5 0.49529412 0.47330961 0.48696682]
mean value: 0.4851328319934076
MCC on Blind test: 0.22
Accuracy on Blind test: 0.6
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02247977 0.02245164 0.02236724 0.02235341 0.02241302 0.02222848
0.02236891 0.02161169 0.02214766 0.02301431]
mean value: 0.02234361171722412
key: score_time
value: [0.01067185 0.01042628 0.01066589 0.01065183 0.01031733 0.01047921
0.01044941 0.01043582 0.01042795 0.01032591]
mean value: 0.010485148429870606
key: test_mcc
value: [0.26154289 0.21117259 0.16567978 0.21779352 0.17525947 0.25036911
0.23088997 0.20575683 0.24693388 0.15085513]
mean value: 0.21162531578248064
key: train_mcc
value: [0.2329678 0.21791454 0.22941424 0.22714007 0.22303597 0.2307595
0.23908074 0.22976342 0.21865795 0.22896865]
mean value: 0.22777028638938351
key: test_fscore
value: [0.46956522 0.43983402 0.41434263 0.44351464 0.41975309 0.46721311
0.45454545 0.43478261 0.46692607 0.4 ]
mean value: 0.4410476850574975
key: train_fscore
value: [0.45488029 0.44645278 0.45227062 0.45119705 0.45026643 0.45333333
0.4567079 0.45227062 0.44722719 0.45284738]
mean value: 0.4517453601227384
key: test_precision
value: [0.375 0.34193548 0.31515152 0.34640523 0.32692308 0.36305732
0.35483871 0.34965035 0.35294118 0.31543624]
mean value: 0.3441339106953589
key: train_precision
value: [0.35463029 0.34449093 0.35387962 0.35175879 0.34396201 0.35289907
0.36253776 0.35362319 0.34293553 0.3509887 ]
mean value: 0.3511705904680436
key: test_recall
value: [0.62790698 0.61627907 0.60465116 0.61627907 0.5862069 0.65517241
0.63218391 0.57471264 0.68965517 0.54651163]
mean value: 0.6149558941459503
key: train_recall
value: [0.63414634 0.63414634 0.62644416 0.62901155 0.65167095 0.63367609
0.61696658 0.62724936 0.64267352 0.63799743]
mean value: 0.6333982331840637
key: test_accuracy
value: [0.66204986 0.62603878 0.59279778 0.63157895 0.60941828 0.6398892
0.63434903 0.6398892 0.62049861 0.60833333]
mean value: 0.6264843028624192
key: train_accuracy
value: [0.63546798 0.62284483 0.63608374 0.63300493 0.61884236 0.63392857
0.64839901 0.63608374 0.61945813 0.6303478 ]
mean value: 0.631446109981548
key: test_roc_auc
value: [0.65031712 0.62268499 0.59687104 0.62632135 0.60149761 0.64510446
0.6336102 0.61764829 0.64409766 0.58712443]
mean value: 0.6225277148234729
key: train_roc_auc
value: [0.63501566 0.6267127 0.63278466 0.63163822 0.63008649 0.63384209
0.63763309 0.63305788 0.62740964 0.63296633]
mean value: 0.6321146743667377
key: test_jcc
value: [0.30681818 0.28191489 0.26130653 0.28494624 0.265625 0.30481283
0.29411765 0.27777778 0.30456853 0.25 ]
mean value: 0.2831887631637641
key: train_jcc
value: [0.29439809 0.28737638 0.29221557 0.29131986 0.29054441 0.29310345
0.29593095 0.29221557 0.28801843 0.29269729]
mean value: 0.2917820004060983
MCC on Blind test: 0.15
Accuracy on Blind test: 0.58
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [4.45087433 4.31354117 4.39370012 4.36822677 4.42704654 4.4370153
4.41505051 4.33542299 4.51664448 4.52926946]
mean value: 4.418679165840149
key: score_time
value: [0.11832476 0.11112666 0.11127853 0.11812425 0.12398505 0.11338377
0.11118126 0.1113627 0.11137271 0.11074758]
mean value: 0.11408872604370117
key: test_mcc
value: [0.18264804 0.30543378 0.19438765 0.20051854 0.14666096 0.07740777
0.1001233 0.21525238 0.16919034 0.15000239]
mean value: 0.17416251472573485
key: train_mcc
value: [0.48373288 0.50015054 0.49360909 0.4959087 0.49977474 0.4851356
0.50546856 0.50016131 0.48396611 0.49592842]
mean value: 0.49438359437246515
key: test_fscore
value: [0.19417476 0.27184466 0.22429907 0.21153846 0.18691589 0.11764706
0.16513761 0.22641509 0.16 0.14285714]
mean value: 0.19008297429844115
key: train_fscore
value: [0.45256917 0.4772066 0.47184466 0.4748062 0.47969052 0.46393762
0.48700674 0.48315688 0.46243902 0.4748062 ]
mean value: 0.47274636161243444
key: test_precision
value: [0.58823529 0.82352941 0.57142857 0.61111111 0.5 0.4
0.40909091 0.63157895 0.61538462 0.58333333]
mean value: 0.5733692193599313
key: train_precision
value: [0.98283262 0.97619048 0.96812749 0.96837945 0.96875 0.95967742
0.96934866 0.96168582 0.95951417 0.96837945]
mean value: 0.9682885549690645
key: test_recall
value: [0.11627907 0.1627907 0.13953488 0.12790698 0.11494253 0.06896552
0.10344828 0.13793103 0.09195402 0.08139535]
mean value: 0.11451483560545309
key: train_recall
value: [0.29396662 0.31578947 0.31193838 0.31450578 0.31876607 0.3059126
0.3251928 0.32262211 0.30462725 0.31450578]
mean value: 0.3127826855998231
key: test_accuracy
value: [0.7700831 0.79224377 0.7700831 0.77285319 0.75900277 0.75069252
0.74792244 0.77285319 0.76731302 0.76666667]
mean value: 0.7669713758079408
key: train_accuracy
value: [0.8294335 0.83405172 0.83251232 0.83312808 0.83435961 0.83066502
0.83589901 0.83466749 0.83035714 0.83317944]
mean value: 0.8328253331453256
key: test_roc_auc
value: [0.54541226 0.5759408 0.55340381 0.55122622 0.53922309 0.5180594
0.52800151 0.55619179 0.53685292 0.53157359]
mean value: 0.5435885392360377
key: train_roc_auc
value: [0.64617327 0.65667967 0.6543491 0.6556328 0.6577636 0.65093201
0.66097697 0.65928676 0.65028933 0.65563346]
mean value: 0.6547716964319119
key: test_jcc
value: [0.10752688 0.15730337 0.12631579 0.11827957 0.10309278 0.0625
0.09 0.12765957 0.08695652 0.07692308]
mean value: 0.10565575685085513
key: train_jcc
value: [0.29246488 0.3133758 0.30876747 0.31130877 0.31552163 0.30203046
0.32188295 0.31852792 0.30076142 0.31130877]
mean value: 0.30959500583103455
MCC on Blind test: 0.22
Accuracy on Blind test: 0.5
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.0254252 0.0202651 0.01994061 0.01987982 0.01798296 0.01940918
0.0202384 0.02061677 0.02010465 0.01829433]
mean value: 0.020215702056884766
key: score_time
value: [0.04770255 0.03151655 0.04156685 0.04237723 0.04240131 0.03961754
0.03060174 0.02881145 0.02755308 0.02802491]
mean value: 0.036017322540283205
key: test_mcc
value: [0.20873959 0.12955387 0.21592555 0.22432932 0.12874875 0.19911673
0.112715 0.0981656 0.05272957 0.15632449]
mean value: 0.15263484716155137
key: train_mcc
value: [0.40443494 0.4100585 0.39596645 0.4220288 0.4234343 0.41925957
0.44036209 0.42209722 0.42576808 0.39723916]
mean value: 0.4160649107828426
key: test_fscore
value: [0.33823529 0.26153846 0.3255814 0.32 0.23140496 0.31818182
0.24615385 0.25 0.2 0.28787879]
mean value: 0.2778974561897084
key: train_fscore
value: [0.47675402 0.49303849 0.47315436 0.49625312 0.4970809 0.49876543
0.51243781 0.49792531 0.49366019 0.4744342 ]
mean value: 0.49135038335310843
key: test_precision
value: [0.46 0.38636364 0.48837209 0.51282051 0.41176471 0.46666667
0.37209302 0.34693878 0.30232558 0.41304348]
mean value: 0.4160388473178661
key: train_precision
value: [0.6980198 0.68099548 0.68280872 0.70616114 0.70783848 0.69336384
0.72196262 0.70257611 0.72098765 0.68357488]
mean value: 0.6998288718227317
key: test_recall
value: [0.26744186 0.19767442 0.24418605 0.23255814 0.16091954 0.24137931
0.18390805 0.1954023 0.14942529 0.22093023]
mean value: 0.2093825180433039
key: train_recall
value: [0.36200257 0.38639281 0.36200257 0.38254172 0.38303342 0.38946015
0.39717224 0.38560411 0.37532134 0.36328626]
mean value: 0.37868171903204617
key: test_accuracy
value: [0.75069252 0.73407202 0.75900277 0.76454294 0.74238227 0.75069252
0.72853186 0.71745152 0.71191136 0.73888889]
mean value: 0.7398168667282241
key: train_accuracy
value: [0.80942118 0.80942118 0.80665025 0.81373153 0.81434729 0.8125
0.81896552 0.81373153 0.81557882 0.80701754]
mean value: 0.8121364834500044
key: test_roc_auc
value: [0.58463002 0.5497463 0.58209302 0.58173362 0.54396342 0.57689403
0.54268395 0.53930699 0.51996812 0.56119504]
mean value: 0.5582214514569538
key: train_roc_auc
value: [0.65629492 0.66464234 0.65447232 0.66615948 0.66661792 0.66760457
0.67449705 0.66709355 0.66478617 0.65512491]
mean value: 0.6637293250719122
key: test_jcc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
value: [0.20353982 0.15044248 0.19444444 0.19047619 0.13084112 0.18918919
 0.14035088 0.14285714 0.11111111 0.16814159]
mean value: 0.16213939705716976
key: train_jcc
value: [0.31298557 0.32717391 0.30989011 0.33001107 0.33074362 0.33223684
0.34448161 0.33149171 0.32772166 0.31098901]
mean value: 0.32577251191274537
MCC on Blind test: 0.19
Accuracy on Blind test: 0.53
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.13144374 0.12404513 0.12337565 0.12458515 0.12356472 0.12306428
0.12256145 0.12534761 0.12341237 0.1240747 ]
mean value: 0.12454748153686523
key: score_time
value: [0.01608467 0.01336503 0.01332116 0.01332617 0.01332116 0.01342273
0.01332378 0.01338816 0.01331568 0.01334023]
mean value: 0.013620877265930175
key: test_mcc
value: [0.37813254 0.440632 0.32849734 0.4176609 0.29267128 0.28111788
0.24164227 0.33784633 0.3421571 0.34783264]
mean value: 0.3408190288016472
key: train_mcc
value: [0.40544808 0.39891642 0.40933593 0.40112635 0.41844998 0.4290699
0.44359061 0.41812964 0.40411331 0.41003959]
mean value: 0.4138219809050511
key: test_fscore
value: [0.4822695 0.5112782 0.44755245 0.51748252 0.38461538 0.42384106
0.39473684 0.45070423 0.45833333 0.43076923]
mean value: 0.450158273984776
key: train_fscore
value: [0.50314465 0.49321628 0.5031746 0.49802372 0.50955414 0.5201581
0.53375196 0.51184834 0.49188312 0.4983871 ]
mean value: 0.5063142013710566
key: test_precision
value: [0.61818182 0.72340426 0.56140351 0.64912281 0.58139535 0.5
0.46153846 0.58181818 0.57894737 0.63636364]
mean value: 0.5892175386268983
key: train_precision
value: [0.64908722 0.65189873 0.65904366 0.64814815 0.66945607 0.67556468
0.68548387 0.66393443 0.66740088 0.670282 ]
mean value: 0.6640299685050933
key: test_recall
value: [0.39534884 0.39534884 0.37209302 0.43023256 0.28735632 0.36781609
0.34482759 0.36781609 0.37931034 0.3255814 ]
mean value: 0.36657310879444
key: train_recall
value: [0.41078306 0.39666239 0.40693196 0.40436457 0.41131105 0.42287918
0.43701799 0.41645244 0.38946015 0.39666239]
mean value: 0.409252518719207
key: test_accuracy
value: [0.79778393 0.8199446 0.78116343 0.80886427 0.77839335 0.75900277
0.74515235 0.78393352 0.78393352 0.79444444]
mean value: 0.7852616189596799
key: train_accuracy
value: [0.80541872 0.80449507 0.80726601 0.80449507 0.81034483 0.81311576
0.81711823 0.80972906 0.80726601 0.80855648]
mean value: 0.8087805247389497
key: test_roc_auc
value: [0.6594926 0.67403805 0.64059197 0.67875264 0.61083145 0.62551389
0.60854518 0.64193724 0.64585955 0.63359362]
mean value: 0.6419156187635107
key: train_roc_auc
value: [0.6703571 0.66491686 0.67025416 0.66755288 0.67367172 0.67945578
0.68693005 0.67502784 0.66416328 0.66756196]
mean value: 0.6719891625986338
key: test_jcc
value: [0.31775701 0.34343434 0.28828829 0.3490566 0.23809524 0.26890756
0.24590164 0.29090909 0.2972973 0.2745098 ]
mean value: 0.2914156877434678
key: train_jcc
value: [0.33613445 0.32733051 0.33616119 0.33157895 0.34188034 0.35149573
0.3640257 0.34394904 0.32615716 0.33190118]
mean value: 0.33906142459767785
MCC on Blind test: 0.3
Accuracy on Blind test: 0.65
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.07337356 0.073246 0.07456088 0.0758307 0.07262731 0.07064414
0.07337832 0.0730896 0.08132815 0.07977676]
mean value: 0.07478554248809814
key: score_time
value: [0.0159266 0.015172 0.01418328 0.01422954 0.01556015 0.01554537
0.0155468 0.0198226 0.01539826 0.01543546]
mean value: 0.015682005882263185
key: test_mcc
value: [0.35984275 0.39659568 0.25147017 0.42769845 0.27617044 0.26156861
0.22998165 0.32090441 0.39599152 0.30378981]
mean value: 0.3224013480814192
key: train_mcc
value: [0.38206622 0.36495082 0.38681148 0.38448879 0.3955001 0.38850064
0.39901178 0.37601642 0.37512447 0.38448993]
mean value: 0.38369606479013674
key: test_fscore
value: [0.44274809 0.46511628 0.38297872 0.49230769 0.32478632 0.37956204
0.35971223 0.41791045 0.48920863 0.37398374]
mean value: 0.4128314205874658
key: train_fscore
value: [0.46269908 0.44387755 0.46921797 0.46296296 0.47088608 0.47088186
0.48270181 0.46179402 0.45144804 0.46062659]
mean value: 0.4637095962146237
key: test_precision
value: [0.64444444 0.69767442 0.49090909 0.72727273 0.63333333 0.52
0.48076923 0.59574468 0.65384615 0.62162162]
mean value: 0.6065615701652318
key: train_precision
value: [0.66666667 0.65743073 0.66666667 0.67237164 0.68550369 0.66745283
0.67201835 0.65258216 0.66919192 0.67661692]
mean value: 0.6686501560509168
key: test_recall
value: [0.3372093 0.34883721 0.31395349 0.37209302 0.2183908 0.29885057
0.28735632 0.32183908 0.3908046 0.26744186]
mean value: 0.31567762630312757
key: train_recall
value: [0.35430039 0.33504493 0.36200257 0.35301669 0.35861183 0.36375321
0.37660668 0.35732648 0.34061697 0.3491656 ]
mean value: 0.3550445333975732
key: test_accuracy
value: [0.79778393 0.80886427 0.75900277 0.81717452 0.78116343 0.76454294
0.7534626 0.78393352 0.8033241 0.78611111]
mean value: 0.7855363188673438
key: train_accuracy
value: [0.80264778 0.79864532 0.80357143 0.80357143 0.80695813 0.80418719
0.80665025 0.80049261 0.80172414 0.80393967]
mean value: 0.8032387949607838
key: test_roc_auc
value: [0.63951374 0.65078224 0.60606765 0.66422833 0.58912241 0.60562967
0.59440809 0.62624801 0.66255558 0.60817348]
mean value: 0.6246729206499048
key: train_roc_auc
value: [0.64920366 0.63998095 0.65244721 0.64937185 0.65339498 0.6533341
0.65935597 0.64870372 0.64379026 0.64826701]
mean value: 0.6497849720180245
key: test_jcc
value: [0.28431373 0.3030303 0.23684211 0.32653061 0.19387755 0.23423423
0.21929825 0.26415094 0.32380952 0.23 ]
mean value: 0.26160872441029825
key: train_jcc
value: [0.30098146 0.2852459 0.30652174 0.30120482 0.30794702 0.30794342
0.31813246 0.30021598 0.29152915 0.29922992]
mean value: 0.301895188129983
MCC on Blind test: 0.34
Accuracy on Blind test: 0.66
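The ConvergenceWarning blocks that surround the logistic-regression fits in this log are lbfgs stopping at its default iteration limit; the warning itself suggests raising max_iter or scaling the data (the MinMaxScaler step already does the latter). A hedged sketch, with an illustrative value rather than the script's:

# Sketch only: give lbfgs more iterations than the default (100) so the
# ConvergenceWarning above is not raised; 3000 is an illustrative choice.
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

lr   = LogisticRegression(max_iter=3000, random_state=42)
lrcv = LogisticRegressionCV(cv=3, max_iter=3000, random_state=42)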
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
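The warnings kept above are emitted once per CV fold in the original run. A minimal sketch, not the script's own code and with illustrative parameter values, of the remedies they point to: a larger max_iter for the lbfgs-based logistic models, and an explicit zero_division when precision is computed on a fold with no predicted positives.

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import precision_score

# Raise the iteration cap for the lbfgs solver (the default max_iter is 100).
clf = LogisticRegressionCV(cv=3, random_state=42, max_iter=5000)

# With no predicted positives, precision is undefined; zero_division makes the
# returned value explicit and silences the UndefinedMetricWarning.
print(precision_score([1, 0, 0, 1], [0, 0, 0, 0], zero_division=0))  # 0.0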
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
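For readability, a minimal sketch of how a pipeline of this shape is typically assembled. The two column lists below are illustrative subsets, not the full 165 numeric and 6 categorical columns printed above, and the sketch is not the script's own code.

from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

num_cols = ['KOLA920101', 'snap2_score', 'volumetric_rr']  # illustrative subset
cat_cols = ['ss_class', 'active_site']                     # illustrative subset

prep = ColumnTransformer(remainder='passthrough',
                         transformers=[('num', MinMaxScaler(), num_cols),
                                       ('cat', OneHotEncoder(), cat_cols)])

pipe = Pipeline(steps=[('prep', prep),
                       ('model', LogisticRegressionCV(cv=3, random_state=42))])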
key: fit_time
value: [1.02097058 0.88417315 1.01830864 0.90714455 1.10207033 0.90377951
1.03620458 0.89738536 0.89795732 1.00278687]
mean value: 0.9670780897140503
key: score_time
value: [0.01368022 0.01371884 0.01379108 0.01370335 0.0136342 0.01369357
0.01675415 0.0137496 0.01365113 0.01371026]
mean value: 0.014008641242980957
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.76177285 0.76177285 0.76177285 0.76177285 0.75900277 0.75900277
0.75900277 0.75900277 0.75900277 0.76111111]
mean value: 0.7603216374269006
key: train_accuracy
value: [0.7601601 0.7601601 0.7601601 0.7601601 0.76046798 0.76046798
0.76046798 0.76046798 0.76046798 0.76023392]
mean value: 0.7603214213695157
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.41
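These scores are the signature of a model that always predicts the majority (susceptible) class: accuracy equals the class prevalence (about 0.76), ROC AUC sits at 0.5, and MCC, F1, precision, recall and Jaccard are all zero. A quick illustrative check with an assumed 76/24 class split (the exact counts are not taken from these folds):

import numpy as np
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = np.array([0] * 76 + [1] * 24)    # assumed ~76/24 split
y_pred = np.zeros_like(y_true)            # constant majority-class prediction
print(accuracy_score(y_true, y_pred))     # 0.76
print(matthews_corrcoef(y_true, y_pred))  # 0.0
print(f1_score(y_true, y_pred))           # 0.0 (with an UndefinedMetricWarning)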
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [11.97634649 8.11275721 6.38592482 7.78677988 8.41906404 10.58547211
9.33912277 3.84677029 9.52769971 2.42708135]
mean value: 7.840701866149902
key: score_time
value: [0.01727247 0.01387978 0.01392674 0.01391149 0.01398492 0.01397777
0.01584888 0.01405358 0.01396585 0.01413274]
mean value: 0.014495420455932616
key: test_mcc
value: [0.43488579 0.32891134 0.29418078 0.35398347 0.28094981 0.17464503
0.19372989 0.26651972 0.3518632 0.26458482]
mean value: 0.29442538579849253
key: train_mcc
value: [0.69158493 0.51239674 0.54021123 0.48278196 0.62254689 0.62609861
0.64824474 0.41875285 0.60076673 0.42412878]
mean value: 0.5567513465710225
key: test_fscore
value: [0.55900621 0.37288136 0.41428571 0.4 0.44171779 0.31654676
0.36708861 0.3220339 0.45714286 0.33333333]
mean value: 0.3984036531775225
key: train_fscore
value: [0.76516129 0.53273543 0.60524226 0.51075269 0.7082495 0.67711599
0.71654676 0.43584906 0.6661597 0.48495271]
mean value: 0.6102765368228498
key: test_precision
value: [0.6 0.6875 0.53703704 0.70588235 0.47368421 0.42307692
0.4084507 0.61290323 0.60377358 0.58823529]
mean value: 0.5640543332636563
key: train_precision
value: [0.769131 0.88392857 0.79375 0.84569733 0.74053296 0.86746988
0.81372549 0.81914894 0.81564246 0.734375 ]
mean value: 0.8083401622820119
key: test_recall
value: [0.52325581 0.25581395 0.3372093 0.27906977 0.4137931 0.25287356
0.33333333 0.2183908 0.36781609 0.23255814]
mean value: 0.32141138732959107
key: train_recall
value: [0.76123235 0.38125802 0.48908858 0.36585366 0.67866324 0.55526992
0.64010283 0.29691517 0.56298201 0.36200257]
mean value: 0.5093368335252829
key: test_accuracy
value: [0.8033241 0.79501385 0.77285319 0.80055402 0.74792244 0.73684211
0.72299169 0.77839335 0.78947368 0.77777778]
mean value: 0.7725146198830409
key: train_accuracy
value: [0.88793103 0.8395936 0.84698276 0.83189655 0.86607143 0.87315271
0.87869458 0.8158867 0.8648399 0.81563558]
mean value: 0.852068484126226
key: test_roc_auc
value: [0.70708245 0.60972516 0.62315011 0.62135307 0.63390385 0.57169226
0.59002433 0.58729759 0.64558688 0.59073162]
mean value: 0.6180547314882859
key: train_roc_auc
value: [0.84456919 0.68273108 0.72449568 0.67239625 0.80188223 0.76427464
0.79697449 0.6381337 0.76145052 0.66035351]
mean value: 0.7347261283375879
key: test_jcc
value: [0.38793103 0.22916667 0.26126126 0.25 0.28346457 0.18803419
0.2248062 0.19191919 0.2962963 0.2 ]
mean value: 0.25128794071398847
key: train_jcc
value: [0.61964472 0.36308068 0.43394077 0.34296029 0.5482866 0.51184834
0.55829596 0.27864897 0.49942987 0.32009081]
mean value: 0.4476227035847935
MCC on Blind test: 0.28
Accuracy on Blind test: 0.61
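The MLP run above also hit its iteration cap (ConvergenceWarning at max_iter=500). Two common adjustments, sketched with illustrative values rather than the script's settings:

from sklearn.neural_network import MLPClassifier

# Either give the stochastic optimizer more iterations...
mlp_more_iter = MLPClassifier(max_iter=2000, random_state=42)
# ...or stop early once a held-out validation score stops improving.
mlp_early_stop = MLPClassifier(max_iter=500, early_stopping=True,
                               n_iter_no_change=20, random_state=42)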
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02586102 0.02621675 0.02607298 0.02615142 0.02591133 0.02595377
0.02594233 0.02602625 0.02639484 0.02661085]
mean value: 0.02611415386199951
key: score_time
value: [0.01340342 0.01343513 0.013376 0.01339364 0.0133667 0.01336098
0.01336455 0.01338482 0.01342845 0.01339436]
mean value: 0.013390803337097168
key: test_mcc
value: [0.17488373 0.12212632 0.10140747 0.183093 0.1012647 0.18989966
0.13425491 0.25297802 0.17380655 0.05880548]
mean value: 0.14925198365714756
key: train_mcc
value: [0.16803579 0.17137991 0.16947462 0.16127878 0.16396564 0.15650459
0.17081107 0.15704526 0.15393093 0.16369601]
mean value: 0.16361225925591208
key: test_fscore
value: [0.33103448 0.28965517 0.30487805 0.34666667 0.2739726 0.34899329
0.29370629 0.39455782 0.30882353 0.25165563]
mean value: 0.3143943537336281
key: train_fscore
value: [0.31595577 0.34664765 0.3338301 0.3255814 0.33790614 0.31293571
0.32323232 0.31755725 0.31114551 0.34253362]
mean value: 0.3267325460415784
key: test_precision
value: [0.40677966 0.3559322 0.32051282 0.40625 0.33898305 0.41935484
0.375 0.48333333 0.42857143 0.29230769]
mean value: 0.38270250286891894
key: train_precision
value: [0.41067762 0.39004815 0.39786856 0.39169675 0.38550247 0.39376218
0.4086444 0.39097744 0.39105058 0.38170347]
mean value: 0.39419316368338675
key: test_recall
value: [0.27906977 0.24418605 0.29069767 0.30232558 0.22988506 0.29885057
0.24137931 0.33333333 0.24137931 0.22093023]
mean value: 0.26820368885324775
key: train_recall
value: [0.25673941 0.31193838 0.28754814 0.27856226 0.30077121 0.2596401
0.26735219 0.26735219 0.25835476 0.31065469]
mean value: 0.27989133124993815
key: test_accuracy
value: [0.73130194 0.71468144 0.68421053 0.72853186 0.70637119 0.73130194
0.72022161 0.7534626 0.73961219 0.68611111]
mean value: 0.7195806401969838
key: train_accuracy
value: [0.73337438 0.7179803 0.72475369 0.72321429 0.71767241 0.72690887
0.73183498 0.72475369 0.72598522 0.71406587]
mean value: 0.7240543698932753
key: test_roc_auc
value: [0.57589852 0.55300211 0.5489852 0.58207188 0.54377465 0.58373186
0.55682104 0.61009732 0.56959476 0.52652351]
mean value: 0.5650500859661063
key: train_roc_auc
value: [0.57024901 0.57901496 0.57512279 0.57103488 0.57487953 0.56686459
0.57274492 0.56808905 0.56581705 0.57597512]
mean value: 0.5719791890072166
key: test_jcc
value: [0.19834711 0.16935484 0.17985612 0.20967742 0.15873016 0.21138211
0.17213115 0.24576271 0.1826087 0.14393939]
mean value: 0.18717897021587018
key: train_jcc
value: [0.18761726 0.2096635 0.20035778 0.19444444 0.20330148 0.18549128
0.19277108 0.18874773 0.18423465 0.20666097]
mean value: 0.1953290179756771
MCC on Blind test: 0.2
Accuracy on Blind test: 0.56
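A note on the 'jcc' entries reported for every model: they are the Jaccard index, which for binary predictions relates to the F1 score as J = F1 / (2 - F1). A quick check against the first MultinomialNB test fold above:

f1 = 0.33103448          # first test_fscore fold above
print(f1 / (2 - f1))     # ~0.19834711, matching the first test_jcc fold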
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02919626 0.02910233 0.02903605 0.02915931 0.02906704 0.04481792
0.03842282 0.02901316 0.02905917 0.02962947]
mean value: 0.031650352478027347
key: score_time
value: [0.0138886 0.01379967 0.01387453 0.01384401 0.01585865 0.01390767
0.01380467 0.01390791 0.01390576 0.0138731 ]
mean value: 0.014066457748413086
key: test_mcc
value: [0.16930648 0.14290242 0.08541178 0.10723819 0.14177282 0.19271679
0.08756853 0.0276401 0.20546053 0.17058961]
mean value: 0.13306072460474103
key: train_mcc
value: [0.14496124 0.15800117 0.16021218 0.15290535 0.16673177 0.16201368
0.16671001 0.15310928 0.15726283 0.15756462]
mean value: 0.15794721228134728
key: test_fscore
value: [0.29230769 0.29787234 0.24637681 0.23622047 0.28148148 0.32352941
0.24637681 0.18320611 0.34285714 0.33557047]
mean value: 0.27857987411347923
key: train_fscore
value: [0.27480916 0.31118061 0.30081301 0.28949545 0.31319555 0.29818781
0.3050571 0.28879668 0.2955665 0.31399845]
mean value: 0.29911003188770524
key: test_precision
value: [0.43181818 0.38181818 0.32692308 0.36585366 0.39583333 0.44897959
0.33333333 0.27272727 0.45283019 0.3968254 ]
mean value: 0.3806942215831342
key: train_precision
value: [0.405 0.398 0.41019956 0.40697674 0.41041667 0.41513761
0.41741071 0.40749415 0.40909091 0.39494163]
mean value: 0.40746679848895645
key: test_recall
value: [0.22093023 0.24418605 0.19767442 0.1744186 0.2183908 0.25287356
0.1954023 0.13793103 0.27586207 0.29069767]
mean value: 0.22083667468591286
key: train_recall
value: [0.20795892 0.25545571 0.23748395 0.22464698 0.25321337 0.23264781
0.2403599 0.22365039 0.23136247 0.2605905 ]
mean value: 0.23673700050489882
key: test_accuracy
value: [0.74515235 0.72576177 0.71191136 0.73130194 0.73130194 0.74515235
0.71191136 0.70360111 0.74515235 0.725 ]
mean value: 0.727624653739612
key: train_accuracy
value: [0.73676108 0.72875616 0.73522167 0.73552956 0.73399015 0.73768473
0.73768473 0.73614532 0.73583744 0.72699292]
mean value: 0.734460375833716
key: test_roc_auc
value: [0.56501057 0.56027484 0.53520085 0.53993658 0.55627569 0.57716671
0.53565735 0.51057136 0.58501133 0.57600577]
mean value: 0.5541111044298842
key: train_roc_auc
value: [0.55578181 0.566772 0.56487401 0.56068315 0.56931923 0.56470447
0.56734594 0.56061062 0.56304965 0.56733978]
mean value: 0.5640480686003684
key: test_jcc
value: [0.17117117 0.175 0.14049587 0.13392857 0.1637931 0.19298246
0.14049587 0.10084034 0.20689655 0.2016129 ]
mean value: 0.16272168288099575
key: train_jcc
value: [0.15929204 0.18425926 0.17703349 0.16924565 0.18567389 0.17521781
0.17998075 0.16876819 0.1734104 0.18623853]
mean value: 0.17591200138843663
MCC on Blind test: 0.15
Accuracy on Blind test: 0.51
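The headline test_mcc values throughout this log are Matthews correlation coefficients. A small illustrative example, with hypothetical labels not taken from these folds, showing the confusion-matrix form of the metric:

import numpy as np
from sklearn.metrics import confusion_matrix, matthews_corrcoef

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 0, 0, 0, 0, 0, 1, 0])
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
mcc = (tp * tn - fp * fn) / np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
print(mcc, matthews_corrcoef(y_true, y_pred))  # both ~0.149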
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.04925942 0.04360366 0.06474972 0.05523849 0.05882573 0.05876207
0.05700588 0.04341865 0.05431557 0.05208445]
mean value: 0.053726363182067874
key: score_time
value: [0.01205039 0.01359773 0.01276278 0.01361132 0.01361179 0.01365495
0.01399994 0.01348567 0.01371884 0.0134263 ]
mean value: 0.013391971588134766
key: test_mcc
value: [ 0.22502658 0.40549109 0.10998814 0.25814566 0.3061782 -0.01970061
0.2897927 0.1001233 0.33440966 0.18463201]
mean value: 0.2194086743899193
key: train_mcc
value: [0.20739223 0.34224844 0.17217925 0.20900638 0.37805804 0.09878484
0.39413961 0.24081428 0.29265014 0.30965623]
mean value: 0.264492943253425
key: test_fscore
value: [0.14893617 0.46875 0.08602151 0.22 0.41176471 0.38139535
0.48888889 0.16513761 0.50980392 0.25862069]
mean value: 0.313931884510026
key: train_fscore
value: [0.1738149 0.43521595 0.10551559 0.22455404 0.49045073 0.39833809
0.55714286 0.28346457 0.48778709 0.36245353]
mean value: 0.35187373425068474
key: test_precision
value: [0.875 0.71428571 0.57142857 0.78571429 0.57142857 0.23906706
0.39855072 0.40909091 0.35616438 0.5 ]
mean value: 0.5420730215540963
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
key: train_precision
value: [0.71962617 0.61647059 0.8 0.61494253 0.60451977 0.24959323
0.46192893 0.60504202 0.34045802 0.65656566]
mean value: 0.566914691322623
key: test_recall
value: [0.08139535 0.34883721 0.04651163 0.12790698 0.32183908 0.94252874
0.63218391 0.10344828 0.89655172 0.1744186 ]
mean value: 0.36756214915797913
key: train_recall
value: [0.09884467 0.33632863 0.05648267 0.13735558 0.4125964 0.98586118
0.70179949 0.18508997 0.85989717 0.25032092]
mean value: 0.40245766934736044
key: test_accuracy
value: [0.77839335 0.81163435 0.76454294 0.78393352 0.77839335 0.26315789
0.68144044 0.74792244 0.58448753 0.76111111]
mean value: 0.6955016928285627
key: train_accuracy
value: [0.77463054 0.79064039 0.7703202 0.77247537 0.79464286 0.28663793
0.73275862 0.77586207 0.56742611 0.78885811]
mean value: 0.7054252198857701
key: test_roc_auc
value: [0.53887949 0.65260042 0.51780127 0.55849894 0.62259837 0.494987
0.6646321 0.52800151 0.69097659 0.55983704]
mean value: 0.5828812736499916
key: train_roc_auc
value: [0.543347 0.635155 0.52601371 0.55510955 0.66378808 0.52612897
0.7221548 0.57351665 0.66760041 0.60451269]
mean value: 0.6017326856648371
key: test_jcc
value: [0.08045977 0.30612245 0.04494382 0.12359551 0.25925926 0.23563218
0.32352941 0.09 0.34210526 0.14851485]
mean value: 0.1954162514512285
key: train_jcc
value: [0.09517923 0.27813163 0.0556962 0.12647754 0.32489879 0.24870298
0.38613861 0.16513761 0.32256509 0.22133939]
mean value: 0.22242670881188334
MCC on Blind test: 0.2
Accuracy on Blind test: 0.54
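PassiveAggressiveClassifier exposes no predict_proba, so a ROC AUC for this model is typically computed from its decision_function margins. A self-contained sketch on synthetic data, not the script's actual scoring code:

from sklearn.datasets import make_classification
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=200, random_state=42)
clf = PassiveAggressiveClassifier(random_state=42).fit(X, y)
print(roc_auc_score(y, clf.decision_function(X)))  # training-set AUC, illustrative only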
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.06803155 0.06753898 0.06708241 0.10242867 0.08745909 0.06451035
0.06895828 0.06940317 0.06996775 0.07108045]
mean value: 0.07364606857299805
key: score_time
value: [0.01680589 0.01475239 0.01498699 0.02450442 0.01536298 0.01747656
0.01516342 0.01495719 0.01505256 0.01495886]
mean value: 0.016402125358581543
key: test_mcc
value: [0.05723908 0.07318448 0.10131482 0.02564381 0.08486984 0.0485607
0.03222683 0.08939848 0.06411465 0.1074385 ]
mean value: 0.06839912116238177
key: train_mcc
value: [0.11377213 0.11377213 0.11876958 0.11330867 0.11413754 0.11505581
0.11642183 0.11228205 0.11596799 0.11562167]
mean value: 0.11491093904976361
key: test_fscore
value: [0.38979118 0.39170507 0.39810427 0.38515081 0.39722864 0.39179954
0.38990826 0.39906103 0.39443155 0.39906103]
mean value: 0.39362413888522213
key: train_fscore
value: [0.39958964 0.39958964 0.40072016 0.39948718 0.3992815 0.39948652
0.39979445 0.39887208 0.39969175 0.40020576]
mean value: 0.3996718690586587
key: test_precision
value: [0.24347826 0.24425287 0.25 0.24057971 0.24855491 0.24431818
0.24355301 0.25073746 0.24709302 0.25 ]
mean value: 0.2462567434669337
key: train_precision
value: [0.24967949 0.24967949 0.25056288 0.24959949 0.24943892 0.24959897
0.24983943 0.24911944 0.24975923 0.25024124]
mean value: 0.24975185756700627
key: test_recall
value: [0.97674419 0.98837209 0.97674419 0.96511628 0.98850575 0.98850575
0.97701149 0.97701149 0.97701149 0.98837209]
mean value: 0.9803394814220798
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1.
1. 1. 0.9987163]
mean value: 0.9998716302952504
key: test_accuracy
value: [0.27146814 0.26869806 0.29639889 0.26592798 0.27700831 0.26038781
0.26315789 0.29085873 0.27700831 0.28888889]
mean value: 0.27598030163127113
key: train_accuracy
value: [0.27924877 0.27924877 0.28263547 0.27894089 0.27924877 0.27986453
0.28078818 0.27801724 0.2804803 0.28224069]
mean value: 0.2800713595846846
key: test_roc_auc
value: [0.51382664 0.51600423 0.53019027 0.5061945 0.51980032 0.50885141
0.50675392 0.5250021 0.51587801 0.52885758]
mean value: 0.5171358986407014
key: train_roc_auc
value: [0.52592143 0.52592143 0.52814905 0.52571891 0.52611336 0.52651822
0.52712551 0.52530364 0.52692308 0.5274958 ]
mean value: 0.5265190423060905
key: test_jcc
value: [0.24207493 0.24355301 0.24852071 0.23850575 0.24783862 0.24362606
0.24216524 0.24926686 0.24566474 0.24926686]
mean value: 0.24504827791629422
key: train_jcc
value: [0.24967949 0.24967949 0.25056288 0.24959949 0.24943892 0.24959897
0.24983943 0.24911944 0.24975923 0.25016077]
mean value: 0.24974381122504088
MCC on Blind test: 0.06
Accuracy on Blind test: 0.59
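The 'Variables are collinear' warnings and the near-constant positive predictions above (recall ~0.98, accuracy ~0.28) suggest an ill-conditioned per-class covariance estimate in QDA. Two common mitigations, sketched with illustrative values rather than the script's configuration:

from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

qda_shrunk = QuadraticDiscriminantAnalysis(reg_param=0.1)  # shrink covariance estimates
qda_on_pcs = make_pipeline(PCA(n_components=30),           # decorrelate features first
                           QuadraticDiscriminantAnalysis())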
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [10.41127205 10.4042697 10.31921792 10.17687821 10.39874578 10.35515833
10.53922057 10.0584898 10.21876431 10.03163767]
mean value: 10.291365432739259
key: score_time
value: [0.14592075 0.15447879 0.15557885 0.1460681 0.1443603 0.14466214
0.14293313 0.14338851 0.14047265 0.14903402]
mean value: 0.1466897249221802
key: test_mcc
value: [0.33472698 0.34835799 0.31217723 0.41011736 0.29692883 0.22316781
0.24992411 0.32073864 0.38146509 0.32189216]
mean value: 0.31994962150239104
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.39344262 0.36842105 0.39370079 0.44628099 0.36065574 0.30894309
0.336 0.40310078 0.38596491 0.38016529]
mean value: 0.37766752585860214
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.66666667 0.75 0.6097561 0.77142857 0.62857143 0.52777778
0.55263158 0.61904762 0.81481481 0.65714286]
mean value: 0.659783741195808
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.27906977 0.24418605 0.29069767 0.31395349 0.25287356 0.2183908
0.24137931 0.29885057 0.25287356 0.26744186]
mean value: 0.2659716653301256
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.79501385 0.80055402 0.7867036 0.81440443 0.78393352 0.76454294
0.7700831 0.7867036 0.80609418 0.79166667]
mean value: 0.7899699907663897
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
value: [0.6177167  0.60936575 0.61625793 0.64243129 0.60271415 0.5781735
 0.58966776 0.62022821 0.61731269 0.61182312]
mean value: 0.6105691107008956
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.24489796 0.22580645 0.24509804 0.28723404 0.22 0.18269231
0.20192308 0.25242718 0.23913043 0.23469388]
mean value: 0.23339033739804876
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.23
Accuracy on Blind test: 0.57
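The perfect train scores next to much lower test scores above are typical of fully grown trees. A sketch of a more constrained forest, with illustrative values (classifier 19 below uses a similar min_samples_leaf), which also sets max_features explicitly as the FutureWarning recommends:

from sklearn.ensemble import RandomForestClassifier

rf_constrained = RandomForestClassifier(n_estimators=1000, max_features='sqrt',
                                        min_samples_leaf=5, n_jobs=10,
                                        random_state=42)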
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
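For reference, the 'prep' + 'model' pipeline printed above follows a standard ColumnTransformer pattern. The sketch below is illustrative only; num_cols and cat_cols are hypothetical placeholder lists standing in for the 165 scaled and 6 one-hot-encoded columns used in this run:

    # Illustrative sketch of the preprocessing + model pipeline shape shown above.
    # num_cols / cat_cols are hypothetical placeholders, not the full column lists.
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

    num_cols = ['KOLA920101', 'snap2_score', 'volumetric_rr']   # placeholder subset
    cat_cols = ['ss_class', 'aa_prop_change']                   # placeholder subset

    prep = ColumnTransformer(remainder='passthrough',
                             transformers=[('num', MinMaxScaler(), num_cols),
                                           ('cat', OneHotEncoder(), cat_cols)])
    pipe = Pipeline(steps=[('prep', prep),
                           ('model', RandomForestClassifier(min_samples_leaf=5,
                                                            n_estimators=1000,
                                                            n_jobs=10, oob_score=True,
                                                            random_state=42))])
    # pipe.fit(X_train, y_train) would then scale, encode and fit in one call.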
key: fit_time
value: [2.16370225 2.19536853 2.18784618 2.18228626 2.19738269 2.1648066
2.16309762 2.12486482 2.07439351 2.07211781]
mean value: 2.1525866270065306
key: score_time
value: [0.35106611 0.33895445 0.37046909 0.3161478 0.38057756 0.38376713
0.28120399 0.37267017 0.37479353 0.33548594]
mean value: 0.3505135774612427
key: test_mcc
value: [0.32633726 0.38885225 0.34989409 0.40882311 0.2656045 0.19716824
0.23962256 0.29692883 0.35769775 0.27151561]
mean value: 0.31024442116814677
key: train_mcc
value: [0.76119059 0.75430455 0.75159649 0.76060869 0.75062307 0.74680177
0.74861429 0.75674921 0.74700298 0.75361374]
mean value: 0.753110538598853
key: test_fscore
value: [0.36206897 0.35514019 0.37931034 0.42735043 0.28828829 0.27118644
0.31404959 0.36065574 0.37931034 0.30357143]
mean value: 0.34409317514581894
key: train_fscore
value: [0.78376269 0.778125 0.7752545 0.78477078 0.77358491 0.77007874
0.77201258 0.78064012 0.76971609 0.77682067]
mean value: 0.7764766084031725
key: test_precision
value: [0.7 0.9047619 0.73333333 0.80645161 0.66666667 0.51612903
0.55882353 0.62857143 0.75862069 0.65384615]
mean value: 0.6927204351407715
key: train_precision
value: [1. 0.99401198 0.9939759 0.99409449 0.99595142 0.99390244
0.99392713 0.99403579 0.99591837 0.99598394]
mean value: 0.9951801437764031
key: test_recall
value: [0.24418605 0.22093023 0.25581395 0.29069767 0.18390805 0.18390805
0.2183908 0.25287356 0.25287356 0.19767442]
mean value: 0.23012563485699009
key: train_recall
value: [0.64441592 0.63928113 0.63543004 0.64826701 0.63239075 0.6285347
0.6311054 0.64267352 0.62724936 0.63671374]
mean value: 0.6366061558058417
key: test_accuracy
value: [0.79501385 0.80886427 0.80055402 0.81440443 0.78116343 0.76177285
0.7700831 0.78393352 0.80055402 0.78333333]
mean value: 0.7899676823638042
key: train_accuracy
value: [0.91471675 0.91256158 0.91163793 0.91471675 0.91133005 0.91009852
0.91071429 0.91348522 0.91009852 0.9122807 ]
mean value: 0.9121640307665715
key: test_roc_auc
value: [0.60572939 0.60682875 0.61336152 0.63443975 0.57735548 0.56458176
0.58182314 0.60271415 0.61366306 0.58241385]
mean value: 0.5982910855107777
key: train_roc_auc
value: [0.82220796 0.81903303 0.81710749 0.82352597 0.81579051 0.81366006
0.81494541 0.82072947 0.81321982 0.81795201]
mean value: 0.8178171741539231
key: test_jcc
value: [0.22105263 0.21590909 0.23404255 0.27173913 0.16842105 0.15686275
0.18627451 0.22 0.23404255 0.17894737]
mean value: 0.20872916352603918
key: train_jcc
value: [0.64441592 0.63682864 0.63299233 0.64578005 0.63076923 0.62612036
0.62868118 0.64020487 0.62564103 0.63508323]
mean value: 0.6346516825952726
MCC on Blind test: 0.22
Accuracy on Blind test: 0.56
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05231833 0.05089331 0.04709482 0.05872607 0.06178594 0.05545545
0.04181433 0.04124594 0.04182744 0.04225469]
mean value: 0.04934163093566894
key: score_time
value: [0.0207901 0.03202271 0.02125096 0.02103305 0.03266883 0.02765298
0.02629399 0.03595972 0.03594208 0.02735376]
mean value: 0.028096818923950197
key: test_mcc
value: [0.32891134 0.36240539 0.34465058 0.38989332 0.22896572 0.27267614
0.23993847 0.32460669 0.30862543 0.22280197]
mean value: 0.3023475034383615
key: train_mcc
value: [0.35775377 0.35421041 0.3698739 0.34864064 0.38069094 0.39041342
0.39168934 0.37440512 0.37868418 0.37715187]
mean value: 0.3723513603519781
key: test_fscore
value: [0.37288136 0.39316239 0.421875 0.43902439 0.26785714 0.36923077
0.34108527 0.41221374 0.390625 0.29059829]
mean value: 0.3698553353800546
key: train_fscore
value: [0.41681574 0.41081081 0.43046944 0.41071429 0.43470483 0.44739169
0.44876325 0.43238434 0.43024302 0.43315508]
mean value: 0.42954524967676944
key: test_precision
value: [0.6875 0.74193548 0.64285714 0.72972973 0.6 0.55813953
0.52380952 0.61363636 0.6097561 0.5483871 ]
mean value: 0.6255750973122618
key: train_precision
value: [0.68731563 0.68882175 0.69428571 0.6744868 0.71470588 0.71671388
0.71751412 0.70231214 0.71771772 0.70845481]
mean value: 0.7022328458897151
key: test_recall
value: [0.25581395 0.26744186 0.31395349 0.31395349 0.17241379 0.27586207
0.25287356 0.31034483 0.28735632 0.19767442]
mean value: 0.26476877840149693
key: train_recall
value: [0.29910141 0.29268293 0.31193838 0.29525032 0.31233933 0.3251928
0.32647815 0.31233933 0.30719794 0.31193838]
mean value: 0.3094458982744339
key: test_accuracy
value: [0.79501385 0.8033241 0.79501385 0.80886427 0.77285319 0.77285319
0.76454294 0.7867036 0.78393352 0.76944444]
mean value: 0.7852546937519237
key: train_accuracy
value: [0.79926108 0.79864532 0.80203202 0.79679803 0.80541872 0.80757389
0.80788177 0.80357143 0.80511084 0.80424746]
mean value: 0.8030540564205431
key: test_roc_auc
value: [0.60972516 0.61917548 0.62970402 0.63879493 0.56795872 0.6032595
0.58994043 0.62415052 0.61448108 0.57328976]
mean value: 0.6070479592073841
key: train_roc_auc
value: [0.62808453 0.62548282 0.6343005 0.62514642 0.63653404 0.64235349
0.64299616 0.63531946 0.63457063 0.63572628]
mean value: 0.6340514320418333
key: test_jcc
value: [0.22916667 0.24468085 0.26732673 0.28125 0.15463918 0.22641509
0.20560748 0.25961538 0.24271845 0.17 ]
mean value: 0.22814198278539588
key: train_jcc
value: [0.26327684 0.2585034 0.27426637 0.25842697 0.27771429 0.2881549
0.28929385 0.27582293 0.27408257 0.27645051]
mean value: 0.2735992611609348
MCC on Blind test: 0.29
Accuracy on Blind test: 0.62
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.19024944 0.12882352 0.2067554 0.21828508 0.21111393 0.2256968
0.21383214 0.21399522 0.21118236 0.21185756]
mean value: 0.20317914485931396
key: score_time
value: [0.02378464 0.01347446 0.02106023 0.02078986 0.02092433 0.02253103
0.02097273 0.02110553 0.02219486 0.02076125]
mean value: 0.020759892463684083
key: test_mcc
value: [0.29245056 0.33538844 0.30023732 0.34989409 0.24735173 0.20155161
0.23993847 0.28001654 0.30280418 0.173294 ]
mean value: 0.2722926950714176
key: train_mcc
value: [0.31929908 0.32360407 0.33184142 0.32195532 0.33792752 0.35480859
0.35140184 0.31627557 0.32369005 0.34388802]
mean value: 0.332469148098823
key: test_fscore
value: [0.2962963 0.3539823 0.36363636 0.37931034 0.25925926 0.29268293
0.34108527 0.33613445 0.35294118 0.22018349]
mean value: 0.31955118795421916
key: train_fscore
value: [0.3610586 0.36121673 0.37802607 0.36738519 0.37582625 0.39515377
0.39405204 0.36037736 0.36034318 0.3796034 ]
mean value: 0.37330426062260835
key: test_precision
value: [0.72727273 0.74074074 0.62857143 0.73333333 0.66666667 0.5
0.52380952 0.625 0.65625 0.52173913]
mean value: 0.6323383550829202
key: train_precision
value: [0.68458781 0.6959707 0.68813559 0.68055556 0.70818505 0.71864407
0.7114094 0.67730496 0.69741697 0.71785714]
mean value: 0.6980067257083101
key: test_recall
value: [0.18604651 0.23255814 0.25581395 0.25581395 0.16091954 0.20689655
0.25287356 0.22988506 0.24137931 0.13953488]
mean value: 0.21617214648489705
key: train_recall
value: [0.24518614 0.24390244 0.2605905 0.25160462 0.25578406 0.27249357
0.27249357 0.24550129 0.24293059 0.25802311]
mean value: 0.2548509888427256
key: test_accuracy
value: [0.78947368 0.79778393 0.7867036 0.80055402 0.77839335 0.75900277
0.76454294 0.78116343 0.7867036 0.76388889]
mean value: 0.7808210218528778
key: train_accuracy
value: [0.79187192 0.79310345 0.79433498 0.7921798 0.79649015 0.80018473
0.79926108 0.79125616 0.79341133 0.79778393]
mean value: 0.794987752957712
key: test_roc_auc
value: [0.58211416 0.6035518 0.60427061 0.61336152 0.56768605 0.57060156
0.58994043 0.59304472 0.60061666 0.54969445]
mean value: 0.5874881966664698
key: train_roc_auc
value: [0.60477209 0.6051428 0.61166423 0.60717129 0.61129284 0.61944517
0.61883788 0.60432959 0.60486611 0.61301965]
mean value: 0.6100541627835477
key: test_jcc
value: [0.17391304 0.21505376 0.22222222 0.23404255 0.14893617 0.17142857
0.20560748 0.2020202 0.21428571 0.12371134]
mean value: 0.19112210571217858
key: train_jcc
value: [0.22029988 0.22041763 0.23306544 0.2250287 0.23139535 0.24622532
0.24537037 0.21979287 0.21976744 0.23426573]
mean value: 0.22956287428240438
MCC on Blind test: 0.31
Accuracy on Blind test: 0.62
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.51930189 0.42096686 0.39334679 0.47701263 0.45034361 0.45509982
0.39498878 0.48014569 0.4160614 0.49533176]
mean value: 0.4502599239349365
key: score_time
value: [0.11452413 0.11114311 0.10948944 0.1165514 0.11608434 0.11028051
0.11074924 0.11670637 0.11070395 0.11060667]
mean value: 0.11268391609191894
key: test_mcc
value: [0.12716417 0.18928361 0.16081667 0.18161326 0.0994787 0.10864279
0.12620049 0.15489816 0.18785015 0.09196503]
mean value: 0.1427913027719976
key: train_mcc
value: [0.2098512 0.20435864 0.21827587 0.18728505 0.23032351 0.25070968
0.26563794 0.21707843 0.22266953 0.2167464 ]
mean value: 0.22229362353414425
key: test_fscore
value: [0.06666667 0.08888889 0.12631579 0.10869565 0.06521739 0.08510638
0.12244898 0.08695652 0.08791209 0.04494382]
mean value: 0.08831521809539981
key: train_fscore
value: [0.1179302 0.12410501 0.12903226 0.0973236 0.1465721 0.16216216
0.18894009 0.12484994 0.1272509 0.13317479]
mean value: 0.13513410661412256
key: test_precision
value: [0.75 1. 0.66666667 0.83333333 0.6 0.57142857
0.54545455 0.8 1. 0.66666667]
mean value: 0.7433549783549783
key: train_precision
value: [0.94230769 0.88135593 0.93103448 0.93023256 0.91176471 0.94520548
0.91111111 0.94545455 0.96363636 0.90322581]
mean value: 0.9265328677397278
key: test_recall
value: [0.03488372 0.04651163 0.06976744 0.05813953 0.03448276 0.04597701
0.06896552 0.04597701 0.04597701 0.02325581]
mean value: 0.04739374498797113
key: train_recall
value: [0.06290116 0.06675225 0.06931964 0.05134788 0.07969152 0.08868895
0.10539846 0.06683805 0.06812339 0.07188703]
mean value: 0.07309483188188667
key: test_accuracy
value: [0.76731302 0.77285319 0.7700831 0.77285319 0.76177285 0.76177285
0.76177285 0.76731302 0.7700831 0.76388889]
mean value: 0.7669706063404125
key: train_accuracy
value: [0.77432266 0.77401478 0.77555419 0.77155172 0.77770936 0.7804803
0.78325123 0.77555419 0.77616995 0.77562327]
mean value: 0.7764231643082298
key: test_roc_auc
value: [0.51562368 0.52325581 0.52942918 0.52725159 0.51359174 0.51751405
0.52535867 0.52116369 0.52298851 0.50980309]
mean value: 0.520598000562997
key: train_roc_auc
value: [0.53084304 0.53195855 0.53384978 0.52506641 0.53863118 0.54353476
0.5510798 0.53281174 0.53365684 0.53472894]
mean value: 0.535616102471739
key: test_jcc
value: [0.03448276 0.04651163 0.06741573 0.05747126 0.03370787 0.04444444
0.06521739 0.04545455 0.04597701 0.02298851]
mean value: 0.04636711448458175
key: train_jcc
value: [0.06265985 0.06615776 0.06896552 0.0511509 0.07908163 0.08823529
0.1043257 0.06658131 0.06794872 0.07133758]
mean value: 0.07264442498443416
MCC on Blind test: 0.17
Accuracy on Blind test: 0.47
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.09215426 0.11704373 0.10246086 0.12951922 0.10674119 0.11783433
0.09352946 0.12433505 0.09088969 0.10864544]
mean value: 0.1083153247833252
key: score_time
value: [0.01140213 0.01124811 0.01148105 0.01113343 0.01151824 0.0111196
0.01145673 0.01130891 0.01169872 0.01122808]
mean value: 0.011359500885009765
key: test_mcc
value: [0.40357057 0.25718686 0.17360755 0.30500099 0.31480131 0.34895448
0.28376941 0.10897158 0.35799979 0.15000239]
mean value: 0.27038649298412887
key: train_mcc
value: [0.37908368 0.25480481 0.29507085 0.23727574 0.27695461 0.41172483
0.34215422 0.27485762 0.36997655 0.2474239 ]
mean value: 0.30893268302939364
key: test_fscore
value: [0.5 0.46290801 0.22018349 0.28571429 0.49535604 0.47682119
0.48648649 0.15238095 0.49032258 0.14285714]
mean value: 0.371303017539668
key: train_fscore
value: [0.49082569 0.46366327 0.30535895 0.21615721 0.47563884 0.51166407
0.51773879 0.26708728 0.4906786 0.20888889]
mean value: 0.39477015815493616
key: test_precision
value: [0.64814815 0.31075697 0.52173913 0.78947368 0.33898305 0.5625
0.34449761 0.44444444 0.55882353 0.58333333]
mean value: 0.5102699900597513
key: train_precision
value: [0.60680529 0.31167109 0.71904762 0.72262774 0.32359759 0.6476378
0.37157247 0.73410405 0.58436945 0.7768595 ]
mean value: 0.5798292588909139
key: test_recall
value: [0.40697674 0.90697674 0.13953488 0.1744186 0.91954023 0.4137931
0.82758621 0.09195402 0.43678161 0.08139535]
mean value: 0.43989574979951884
key: train_recall
value: [0.41206675 0.90500642 0.19383825 0.12708601 0.89717224 0.42287918
0.85347044 0.16323907 0.42287918 0.12066752]
mean value: 0.4518305057898367
key: test_accuracy
value: [0.80609418 0.49861496 0.76454294 0.79224377 0.54847645 0.78116343
0.57894737 0.7534626 0.78116343 0.76666667]
mean value: 0.7071375807940905
key: train_accuracy
value: [0.79495074 0.49784483 0.78848522 0.77894089 0.52616995 0.80665025
0.61915025 0.7854064 0.78971675 0.78085565]
mean value: 0.716817091882762
key: test_roc_auc
value: [0.66894292 0.63894292 0.54976744 0.57993658 0.67509858 0.65580166
0.6637931 0.52772884 0.66364628 0.53157359]
mean value: 0.6155231900955125
key: train_roc_auc
value: [0.66391106 0.63719337 0.58497097 0.55584758 0.65324199 0.67520477
0.69940728 0.5723078 0.66407117 0.55486817]
mean value: 0.6261024158204581
key: test_jcc
value: [0.33333333 0.3011583 0.12371134 0.16666667 0.32921811 0.31304348
0.32142857 0.08247423 0.32478632 0.07692308]
mean value: 0.23727434265633382
key: train_jcc
value: [0.32522796 0.30179795 0.18019093 0.12117503 0.31202503 0.34378265
0.34928985 0.15412621 0.32509881 0.11662531]
mean value: 0.2529339743217077
MCC on Blind test: 0.31
Accuracy on Blind test: 0.66
Running classifier: 24
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
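The long repr above mostly shows defaults; the only non-default settings visible are use_label_encoder=False, verbosity=0 and random_state=42. A minimal sketch of an equivalent construction, assuming the xgboost 1.x API that produces this repr:

    # Sketch: equivalent construction with only the non-default arguments spelled out;
    # every parameter printed as None above is simply left at its default.
    from xgboost import XGBClassifier

    xgb_clf = XGBClassifier(n_estimators=100, random_state=42,
                            use_label_encoder=False, verbosity=0)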
key: fit_time
value: [0.45673394 0.48911667 0.42695117 0.5678823 0.41514325 0.44388843
0.45433116 0.59146976 0.45420647 0.42519808]
mean value: 0.4724921226501465
key: score_time
value: [0.01321292 0.01254225 0.0123415 0.01266503 0.01248455 0.01243067
0.01235414 0.01236892 0.01330924 0.01228046]
mean value: 0.012598967552185059
key: test_mcc
value: [0.45581807 0.43867074 0.41244377 0.43241641 0.36294044 0.34143507
0.33671283 0.3738926 0.47915936 0.39239667]
mean value: 0.40258859604684727
key: train_mcc
value: [1. 1. 1. 0.99915563 1. 0.99915488
1. 1. 1. 1. ]
mean value: 0.9998310507924867
key: test_fscore
value: [0.56774194 0.53793103 0.53246753 0.5248227 0.46808511 0.47741935
0.47058824 0.47887324 0.57142857 0.5 ]
mean value: 0.5129357704850619
key: train_fscore
value: [1. 1. 1. 0.99935774 1. 0.99935691
1. 1. 1. 1. ]
mean value: 0.9998714652425413
key: test_precision
value: [0.63768116 0.66101695 0.60294118 0.67272727 0.61111111 0.54411765
0.54545455 0.61818182 0.7 0.62068966]
mean value: 0.6213921334749405
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.51162791 0.45348837 0.47674419 0.43023256 0.37931034 0.42528736
0.4137931 0.3908046 0.48275862 0.41860465]
mean value: 0.43826516974071106
key: train_recall
value: [1. 1. 1. 0.9987163 1. 0.99871465
1. 1. 1. 1. ]
mean value: 0.99974309559088
key: test_accuracy
value: [0.81440443 0.81440443 0.80055402 0.81440443 0.79224377 0.77562327
0.77562327 0.79501385 0.82548476 0.8 ]
mean value: 0.800775623268698
key: train_accuracy
value: [1. 1. 1. 0.99969212 1. 0.99969212
1. 1. 1. 1. ]
mean value: 0.9999384236453203
key: test_roc_auc
value: [0.71035941 0.69038055 0.68928118 0.68238901 0.651334 0.65607434
0.65215203 0.65708113 0.7085326 0.66915634]
mean value: 0.6766740579957704
key: train_roc_auc
value: [1. 1. 1. 0.99935815 1. 0.99935733
1. 1. 1. 1. ]
mean value: 0.9998715477954401
key: test_jcc
value: [0.3963964 0.36792453 0.36283186 0.35576923 0.30555556 0.31355932
0.30769231 0.31481481 0.4 0.33333333]
mean value: 0.3457877347304503
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
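The Int64Index deprecation above is triggered inside xgboost's compat module on import, not by this script; in user code the replacement suggested by the warning looks like this (minimal sketch):

    import pandas as pd

    # Deprecated spelling:         pd.Int64Index([1, 2, 3])
    # Replacement per the warning:
    idx = pd.Index([1, 2, 3], dtype='int64')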
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
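Both SettingWithCopyWarning messages point at plain column assignment on frames that may be views of another DataFrame (lines 419 and 446 of MultClfs_logo_skf.py). A minimal, self-contained sketch of the usual fix, using a toy frame in place of the real score frames:

    import pandas as pd

    # Toy stand-in for the scores frame built in MultClfs_logo_skf.py.
    scores = pd.DataFrame({'MCC': [0.31, 0.27], 'Accuracy': [0.79, 0.78]})

    # Slicing like this is what typically makes the later assignment warn.
    scoresDF_CV = scores[scores['MCC'] > 0.28]

    # Fix: take an explicit copy, then assign the new column via .loc.
    scoresDF_CV = scoresDF_CV.copy()
    scoresDF_CV.loc[:, 'source_data'] = 'CV'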
key: train_jcc
value: [1. 1. 1. 0.9987163 1. 0.99871465
1. 1. 1. 1. ]
mean value: 0.99974309559088
MCC on Blind test: 0.24
Accuracy on Blind test: 0.61
Extracting tts_split_name: logo_skf_BT_pnca
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_pnca
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
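The combining step logged just above row-binds the 24-row CV and 24-row BT score frames on their 8 shared metric columns before merging in the metadata. A minimal sketch of that rowbind, with toy one-row frames standing in for the real ones:

    import pandas as pd

    # Toy stand-ins for the CV and BT score frames (8 shared columns in the real run).
    cv_df = pd.DataFrame({'MCC': [0.31], 'Accuracy': [0.79], 'source_data': ['CV']})
    bt_df = pd.DataFrame({'MCC': [0.22], 'Accuracy': [0.56], 'source_data': ['BT']})

    common_cols = cv_df.columns.intersection(bt_df.columns)
    combined_df_wf = pd.concat([cv_df[common_cols], bt_df[common_cols]],
                               axis=0, ignore_index=True)      # rowbind
    assert len(combined_df_wf) == len(cv_df) + len(bt_df)      # 48 rows in the real run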
BTS gene: gid
Total genes: 6
Training on: 5
Training on genes: ['alr', 'katg', 'pnca', 'rpob', 'embb']
Omitted genes: ['gid']
Blind test gene: gid
/home/tanu/git/Data/ml_combined/6genes_logo_skf_BT_gid.csv
Training data dim: (3502, 171)
Training Target dim: (3502,)
Checked training df does NOT have Target var
TEST data dim: (531, 171)
TEST Target dim: (531,)
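The split logged above holds out every row belonging to one gene ('gid') as the blind-test set and trains on the remaining five genes. A minimal sketch of such a leave-one-gene-out split, assuming the combined DataFrame carries the 'gene_name' column from the feature list and that 'dst_mode' is the target used here:

    # Sketch of a leave-one-gene-out split; 'gene_name' and 'dst_mode' are assumed
    # column names, and df is the combined pandas DataFrame across all genes.
    def logo_split(df, blind_gene, target_col='dst_mode'):
        train = df[df['gene_name'] != blind_gene]
        test = df[df['gene_name'] == blind_gene]
        X_train, y_train = train.drop(columns=[target_col]), train[target_col]
        X_test, y_test = test.drop(columns=[target_col]), test[target_col]
        return X_train, y_train, X_test, y_test

    # X_train, y_train, X_test, y_test = logo_split(combined_df, blind_gene='gid')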
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
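Each (name, estimator) pair in the list above is then scored fold by fold, which is where the key/value blocks below come from. A self-contained sketch of that outer loop on synthetic data; the real run additionally wraps each estimator in the 'prep' pipeline and records the full metric set shown below:

    # Sketch of the outer loop over (name, estimator) pairs with stratified 10-fold CV.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.linear_model import RidgeClassifier
    from sklearn.model_selection import StratifiedKFold, cross_validate

    X, y = make_classification(n_samples=300, n_features=20, random_state=42)
    models = [('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
              ('Ridge Classifier', RidgeClassifier(random_state=42))]

    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    for name, estimator in models:
        scores = cross_validate(estimator, X, y, cv=skf, return_train_score=True,
                                scoring=['matthews_corrcoef', 'accuracy'])
        print(name, scores['test_matthews_corrcoef'].mean())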
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.76296449 0.76670361 0.76700497 0.76825643 0.77757978 0.80970788
0.79003167 0.7899456 0.84364414 0.82119226]
mean value: 0.78970308303833
key: score_time
value: [0.01881385 0.01921582 0.02059698 0.01871228 0.02091503 0.02099347
0.01932049 0.01925135 0.01894045 0.01899171]
mean value: 0.0195751428604126
key: test_mcc
value: [0.41676598 0.48938108 0.34057905 0.37581731 0.45933091 0.40782721
0.36026294 0.21826634 0.53042818 0.43351089]
mean value: 0.4032169871562676
key: train_mcc
value: [0.49790482 0.47335818 0.4824884 0.49938234 0.49021442 0.50631245
0.48569682 0.53488973 0.47720658 0.49385625]
mean value: 0.4941309971662684
key: test_fscore
value: [0.58 0.62564103 0.53398058 0.54822335 0.59574468 0.57286432
0.54634146 0.41304348 0.65284974 0.56353591]
mean value: 0.5632224555088564
key: train_fscore
value: [0.62847222 0.61264368 0.61918438 0.63012118 0.62678063 0.63899943
0.62828396 0.65306122 0.61243463 0.6253619 ]
mean value: 0.6275343224887017
key: test_precision
value: [0.63043478 0.70114943 0.55555556 0.6 0.69135802 0.62637363
0.57731959 0.5 0.74117647 0.69863014]
mean value: 0.6321997609719994
key: train_precision
value: [0.71541502 0.69130999 0.69909209 0.71559633 0.70063694 0.71139241
0.68536585 0.75067024 0.70079787 0.71240106]
mean value: 0.70826777956983
key: test_recall
value: [0.53703704 0.56481481 0.51401869 0.5046729 0.52336449 0.52777778
0.51851852 0.35185185 0.58333333 0.47222222]
mean value: 0.509761163032191
key: train_recall
value: [0.56037152 0.5500516 0.5556701 0.5628866 0.56701031 0.57997936
0.57997936 0.57791538 0.54385965 0.55727554]
mean value: 0.5634999414850043
key: test_accuracy
value: [0.76068376 0.79202279 0.72571429 0.74571429 0.78285714 0.75714286
0.73428571 0.69142857 0.80857143 0.77428571]
mean value: 0.7572706552706553
key: train_accuracy
value: [0.79625516 0.78609965 0.78965736 0.79663706 0.79219543 0.79854061
0.78902284 0.81123096 0.78838832 0.7947335 ]
mean value: 0.794276089936802
key: test_roc_auc
value: [0.69855967 0.72890947 0.66647437 0.67826237 0.71024191 0.69364096
0.67454852 0.59741353 0.74621212 0.69065657]
mean value: 0.6884919477032193
key: train_roc_auc
value: [0.73068988 0.72048868 0.72467282 0.73171828 0.72965548 0.73776797
0.73089669 0.74635576 0.72039524 0.72870648]
mean value: 0.7301347270441128
key: test_jcc
value: [0.4084507 0.45522388 0.36423841 0.37762238 0.42424242 0.40140845
0.37583893 0.26027397 0.48461538 0.39230769]
mean value: 0.3944222223687734
key: train_jcc
value: [0.45822785 0.44159072 0.4484193 0.45998315 0.45643154 0.4695071
0.45802771 0.48484848 0.44137353 0.45492839]
mean value: 0.4573337777167173
MCC on Blind test: 0.08
Accuracy on Blind test: 0.73
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
  warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
  oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Running model pipeline:
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.35466862 0.43110561 0.41605783 0.42225742 0.31111526 0.41626239
0.43627977 0.43344069 0.42614102 0.42092538]
mean value: 0.4068253993988037
key: score_time
value: [0.03199792 0.04840517 0.04728079 0.04407668 0.0335815 0.04342532
0.04431272 0.04739046 0.04658294 0.04955792]
mean value: 0.04366114139556885
key: test_mcc
value: [0.46182695 0.38390153 0.38428364 0.33624519 0.35121604 0.4350156
0.37289562 0.19855464 0.49634404 0.38545851]
mean value: 0.3805741762908967
key: train_mcc
value: [0.96196359 0.95078439 0.96574545 0.94642923 0.95165004 0.95384588
0.95758384 0.96048083 0.95381667 0.96134916]
mean value: 0.9563649084455319
key: test_fscore
value: [0.59574468 0.5326087 0.55555556 0.5 0.51086957 0.56830601
0.53191489 0.36686391 0.61621622 0.5 ]
mean value: 0.5278079523363827
key: train_fscore
value: [0.97317201 0.96504237 0.97584034 0.96170213 0.96551724 0.96712619
0.9698253 0.97209057 0.96716102 0.97248677]
mean value: 0.968996395360094
key: test_precision
value: [0.7 0.64473684 0.6043956 0.5974026 0.61038961 0.69333333
0.625 0.50819672 0.74025974 0.7 ]
mean value: 0.6423714449197625
key: train_precision
value: [0.99248927 0.99129489 0.99464668 0.99340659 0.99453552 0.99454744
0.99565217 0.99247312 0.99347116 0.99782845]
mean value: 0.9940345290743124
key: test_recall
value: [0.51851852 0.4537037 0.51401869 0.42990654 0.43925234 0.48148148
0.46296296 0.28703704 0.52777778 0.38888889]
mean value: 0.4503547940463829
key: train_recall
value: [0.95459236 0.94014448 0.95773196 0.93195876 0.93814433 0.94117647
0.94530444 0.95252838 0.94220846 0.94840041]
mean value: 0.9452190056706351
key: test_accuracy
value: [0.78347578 0.75498575 0.74857143 0.73714286 0.74285714 0.77428571
0.74857143 0.69428571 0.79714286 0.76 ]
mean value: 0.754131868131868
key: train_accuracy
value: [0.98381466 0.97905427 0.98540609 0.97715736 0.97937817 0.98032995
0.98191624 0.98318528 0.98032995 0.98350254]
mean value: 0.9814074514254599
key: test_roc_auc
value: [0.70987654 0.6712963 0.68293527 0.65116726 0.65789777 0.69322008
0.66949801 0.58153505 0.72256657 0.65725436]
mean value: 0.669724722126072
key: train_roc_auc
value: [0.97569215 0.96823906 0.97772024 0.9646045 0.96792643 0.96944302
0.97173605 0.97466089 0.96972998 0.97374212]
mean value: 0.9713494431204891
key: test_jcc
value: [0.42424242 0.36296296 0.38461538 0.33333333 0.34306569 0.39694656
0.36231884 0.22463768 0.4453125 0.33333333]
mean value: 0.3610768718542722
key: train_jcc
value: [0.9477459 0.93244626 0.95282051 0.92622951 0.93333333 0.93634497
0.94141829 0.94569672 0.93641026 0.94644696]
mean value: 0.939889272281575
MCC on Blind test: 0.01
Accuracy on Blind test: 0.76
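The OOB warnings during the Bagging Classifier fits come from oob_score=True combined with the default of only 10 base estimators, so some samples never land out-of-bag. A minimal sketch of the usual remedy (an assumption about the cause, not a change made in this run):

    # Sketch only: more estimators make the OOB estimate well defined;
    # alternatively drop oob_score=True if it is not needed.
    from sklearn.ensemble import BaggingClassifier

    bag = BaggingClassifier(n_estimators=100, oob_score=True,
                            n_jobs=10, random_state=42)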
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.23187184 0.21979785 0.2185154 0.23228812 0.23618817 0.21727228
0.23158073 0.21827698 0.2348454 0.21771502]
mean value: 0.22583518028259278
key: score_time
value: [0.01054025 0.01104379 0.01001501 0.0103147 0.01027775 0.0105648
0.01029396 0.01018524 0.01018333 0.01010728]
mean value: 0.010352611541748047
key: test_mcc
value: [0.33101313 0.33966892 0.20874456 0.24340732 0.29637355 0.27103906
0.29474926 0.1977243 0.26346801 0.24747954]
mean value: 0.26936676571377804
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.54625551 0.5437788 0.46491228 0.48908297 0.50943396 0.5
0.52401747 0.43809524 0.49074074 0.46 ]
mean value: 0.4966316966934354
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.5210084 0.5412844 0.43801653 0.45901639 0.51428571 0.49107143
0.49586777 0.45098039 0.49074074 0.5 ]
mean value: 0.49022717737490995
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.57407407 0.5462963 0.4953271 0.52336449 0.5046729 0.50925926
0.55555556 0.42592593 0.49074074 0.42592593]
mean value: 0.5051142263759086
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.70655271 0.71794872 0.65142857 0.66571429 0.70285714 0.68571429
0.68857143 0.66285714 0.68571429 0.69142857]
mean value: 0.6858787138787139
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66975309 0.67026749 0.60774586 0.62587977 0.64739818 0.63686103
0.65174472 0.59726048 0.63173401 0.61792164]
mean value: 0.6356566268430235
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.37575758 0.37341772 0.30285714 0.32369942 0.34177215 0.33333333
0.35502959 0.2804878 0.32515337 0.2987013 ]
mean value: 0.3310209410942384
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.06
Accuracy on Blind test: 0.56
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02358341 0.02339959 0.02315235 0.02317953 0.0230186 0.02319551
0.02322984 0.02313638 0.02315593 0.02318048]
mean value: 0.023223161697387695
key: score_time
value: [0.01021409 0.010149 0.01008773 0.01011729 0.01010251 0.01013255
0.01013017 0.01008415 0.01017737 0.01013613]
mean value: 0.010133099555969239
key: test_mcc
value: [0.24438468 0.28049823 0.19136789 0.23928114 0.22552922 0.09838788
0.30364249 0.08365695 0.19364917 0.25106359]
mean value: 0.21114612241226882
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.46889952 0.50678733 0.45689655 0.4845815 0.45410628 0.37788018
0.51851852 0.34653465 0.43269231 0.47887324]
mean value: 0.45257700850071636
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.48514851 0.49557522 0.424 0.45833333 0.47 0.37614679
0.51851852 0.37234043 0.45 0.48571429]
mean value: 0.45357770881793014
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.4537037 0.51851852 0.4953271 0.51401869 0.43925234 0.37962963
0.51851852 0.32407407 0.41666667 0.47222222]
mean value: 0.45319314641744546
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.68376068 0.68945869 0.64 0.66571429 0.67714286 0.61428571
0.70285714 0.62285714 0.66285714 0.68285714]
mean value: 0.6641790801790801
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.61985597 0.64197531 0.5995154 0.62326449 0.61057267 0.54931895
0.65182124 0.54013621 0.59469697 0.62454086]
mean value: 0.6055698072324619
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.30625 0.33939394 0.29608939 0.31976744 0.29375 0.23295455
0.35 0.20958084 0.27607362 0.31481481]
mean value: 0.2938674584953881
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.03
Accuracy on Blind test: 0.62
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.54288721 0.55212379 0.54469013 0.54051018 0.57491326 0.55264211
0.55754042 0.54562712 0.54448295 0.54023361]
mean value: 0.5495650768280029
key: score_time
value: [0.02612329 0.02612758 0.02615452 0.02643156 0.02847147 0.02782321
0.02611279 0.02606106 0.02611756 0.02600098]
mean value: 0.026542401313781737
key: test_mcc
value: [0.34279347 0.32062683 0.3967101 0.26722286 0.30666635 0.32693837
0.38245775 0.23477767 0.51127617 0.37661878]
mean value: 0.34660883559709277
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.49723757 0.49197861 0.55497382 0.44808743 0.46927374 0.48618785
0.51428571 0.40909091 0.62702703 0.49101796]
mean value: 0.4989160635166094
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.61643836 0.58227848 0.63095238 0.53947368 0.58333333 0.60273973
0.67164179 0.52941176 0.75324675 0.69491525]
mean value: 0.6204431524935379
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.41666667 0.42592593 0.4953271 0.38317757 0.39252336 0.40740741
0.41666667 0.33333333 0.53703704 0.37962963]
mean value: 0.41876947040498436
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.74074074 0.72934473 0.75714286 0.71142857 0.72857143 0.73428571
0.75714286 0.70285714 0.80285714 0.75714286]
mean value: 0.7421514041514041
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.65072016 0.64506173 0.68387754 0.61957232 0.63453329 0.64378635
0.66287879 0.60055096 0.72926232 0.65262473]
mean value: 0.652286820023769
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.33088235 0.32624113 0.38405797 0.28873239 0.30656934 0.32116788
0.34615385 0.25714286 0.45669291 0.32539683]
mean value: 0.33430375214303676
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.01
Accuracy on Blind test: 0.7
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.73585415 3.79217792 3.71996355 3.65509772 3.78609657 3.71992517
3.6738801 3.74513841 3.76477146 3.77856445]
mean value: 3.7371469497680665
key: score_time
value: [0.01130843 0.01050067 0.01117325 0.01044488 0.01145053 0.01072693
0.01197934 0.01043487 0.01125741 0.01177025]
mean value: 0.011104655265808106
key: test_mcc
value: [0.48095556 0.51086031 0.44654432 0.44453046 0.48874019 0.47027536
0.38141575 0.31715847 0.53826754 0.4782532 ]
mean value: 0.4557001155804796
key: train_mcc
value: [0.63819682 0.63765706 0.64006687 0.63878187 0.64011059 0.64984579
0.64896941 0.62784229 0.62512567 0.64475072]
mean value: 0.6391347089987157
key: test_fscore
value: [0.62686567 0.62295082 0.58947368 0.59487179 0.61702128 0.61616162
0.53968254 0.48648649 0.65979381 0.59217877]
mean value: 0.594548647470534
key: train_fscore
value: [0.72545561 0.72790698 0.72533963 0.72896111 0.72663139 0.73708648
0.73573923 0.71962617 0.71504425 0.73015873]
mean value: 0.7271949587418625
key: test_precision
value: [0.67741935 0.76 0.6746988 0.65909091 0.71604938 0.67777778
0.62962963 0.58441558 0.74418605 0.74647887]
mean value: 0.6869746353400447
key: train_precision
value: [0.84289617 0.83355526 0.84923928 0.83399734 0.84541724 0.84217507
0.84379172 0.82907133 0.83471074 0.84836066]
mean value: 0.8403214816496163
key: test_recall
value: [0.58333333 0.52777778 0.52336449 0.54205607 0.54205607 0.56481481
0.47222222 0.41666667 0.59259259 0.49074074]
mean value: 0.5255624783662166
key: train_recall
value: [0.63673891 0.64602683 0.63298969 0.64742268 0.6371134 0.65531476
0.65221878 0.63570691 0.625387 0.64086687]
mean value: 0.6409785835115381
key: test_accuracy
value: [0.78632479 0.8034188 0.77714286 0.77428571 0.79428571 0.78285714
0.75142857 0.72857143 0.81142857 0.79142857]
mean value: 0.7801172161172161
key: train_accuracy
value: [0.85179308 0.85147572 0.85247462 0.8518401 0.85247462 0.85628173
0.85596447 0.84771574 0.84676396 0.85437817]
mean value: 0.8521162204569656
key: test_roc_auc
value: [0.72993827 0.72685185 0.70612669 0.70929964 0.72370293 0.72249005
0.67412764 0.64221763 0.75084175 0.70818029]
mean value: 0.7093776749209583
key: train_roc_auc
value: [0.79201748 0.79436997 0.79151776 0.79506789 0.79266303 0.80040131
0.79931141 0.78876505 0.78520839 0.79500971]
mean value: 0.7934331987816876
key: test_jcc
value: [0.45652174 0.45238095 0.41791045 0.42335766 0.44615385 0.44525547
0.36956522 0.32142857 0.49230769 0.42063492]
mean value: 0.42455165258750477
key: train_jcc
value: [0.56918819 0.57221207 0.56904541 0.57351598 0.57063712 0.58363971
0.58195212 0.5620438 0.55647383 0.575 ]
mean value: 0.57137082195307
MCC on Blind test: 0.0
Accuracy on Blind test: 0.8
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02377439 0.02267599 0.02248406 0.02395535 0.02301288 0.02358985
0.0240171 0.02351475 0.02222157 0.02287412]
mean value: 0.02321200370788574
key: score_time
value: [0.01044512 0.01016331 0.0115087 0.01030588 0.01141715 0.01150131
0.01128292 0.01144981 0.01067376 0.01150012]
mean value: 0.011024808883666993
key: test_mcc
value: [0.44633354 0.31146417 0.31683898 0.21661725 0.29441974 0.23320414
0.28373847 0.23525978 0.3506358 0.25424668]
mean value: 0.2942758539296778
key: train_mcc
value: [0.29243865 0.30042651 0.29832024 0.29668847 0.29951821 0.29921369
0.29991137 0.30829504 0.28742328 0.30896002]
mean value: 0.2991195464323728
key: test_fscore
value: [0.63070539 0.55555556 0.55762082 0.4962406 0.54054054 0.49799197
0.53333333 0.5093633 0.57692308 0.512 ]
mean value: 0.5410274583642639
key: train_fscore
value: [0.54104636 0.54443486 0.54188482 0.54256234 0.54459691 0.54388985
0.54537708 0.54814815 0.53910849 0.55006446]
mean value: 0.5441113314514708
key: test_precision
value: [0.57142857 0.46296296 0.46296296 0.41509434 0.46052632 0.43971631
0.46258503 0.42767296 0.49342105 0.45070423]
mean value: 0.464707473279549
key: train_precision
value: [0.4602026 0.46775389 0.46974281 0.46533923 0.46622614 0.46642066
0.46444122 0.47435897 0.45493258 0.4712813 ]
mean value: 0.46606994117236733
key: test_recall
value: [0.7037037 0.69444444 0.70093458 0.61682243 0.65420561 0.57407407
0.62962963 0.62962963 0.69444444 0.59259259]
mean value: 0.6490481135340949
key: train_recall
value: [0.65634675 0.65118679 0.64020619 0.65051546 0.65463918 0.65221878
0.66047472 0.64912281 0.66150671 0.66047472]
mean value: 0.6536692094092114
key: test_accuracy
value: [0.74643875 0.65811966 0.66 0.61714286 0.66 0.64285714
0.66 0.62571429 0.68571429 0.65142857]
mean value: 0.6607415547415548
key: train_accuracy
value: [0.65756903 0.6648683 0.66687817 0.66243655 0.66307107 0.66370558
0.66148477 0.67100254 0.65228426 0.66782995]
mean value: 0.6631130214886258
key: test_roc_auc
value: [0.7345679 0.66820988 0.67145494 0.61705319 0.65837852 0.6238139
0.65159167 0.62679829 0.68813131 0.63513927]
mean value: 0.6575138877366763
key: train_roc_auc
value: [0.65722929 0.66106544 0.65947065 0.65912574 0.6607293 0.66051159
0.66120392 0.66491871 0.65484864 0.66578477]
mean value: 0.6604888041198859
key: test_jcc
value: [0.46060606 0.38461538 0.38659794 0.33 0.37037037 0.3315508
0.36363636 0.34170854 0.40540541 0.34408602]
mean value: 0.37185768891358967
key: train_jcc
value: [0.37084548 0.37403675 0.37163375 0.37227139 0.37418975 0.37352246
0.37492677 0.37755102 0.36902706 0.37937167]
mean value: 0.37373760929429617
MCC on Blind test: 0.05
Accuracy on Blind test: 0.29
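The two blind-test lines summarise the fitted model's performance on a held-out set kept outside the cross-validation. A minimal sketch under that assumption is shown below; the synthetic data and names such as X_blind/y_blind are illustrative, not taken from the script.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import matthews_corrcoef, accuracy_score

# Synthetic stand-in; X_blind/y_blind play the role of the blind test set.
X, y = make_classification(n_samples=400, n_features=20, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_blind)
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))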
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [4.38879108 4.38229322 4.5046196 4.15736365 4.09703493 4.05187964
4.0418973 4.11157584 4.15766549 4.17857361]
mean value: 4.207169437408448
key: score_time
value: [0.10825753 0.11075377 0.11338806 0.10793972 0.1079073 0.10783029
0.11169577 0.10889673 0.10866928 0.10859442]
mean value: 0.10939328670501709
key: test_mcc
value: [0.35158767 0.39472568 0.33478851 0.19728965 0.24786491 0.3289988
0.33112964 0.20454545 0.39004482 0.2992864 ]
mean value: 0.30802615348396767
key: train_mcc
value: [0.60393158 0.60349676 0.60592537 0.60839298 0.59372402 0.59980133
0.61203425 0.61545145 0.60339391 0.59375723]
mean value: 0.6039908877307714
key: test_fscore
value: [0.46987952 0.50887574 0.47058824 0.34782609 0.39759036 0.46153846
0.44444444 0.34177215 0.47133758 0.4 ]
mean value: 0.43138525789131565
key: train_fscore
value: [0.66492489 0.66623292 0.67054264 0.67311412 0.66062176 0.66795367
0.67401167 0.67702265 0.66666667 0.65885417]
mean value: 0.6679945144805616
key: test_precision
value: [0.67241379 0.70491803 0.63492063 0.51851852 0.55932203 0.63934426
0.66666667 0.54 0.75510204 0.65957447]
mean value: 0.6350780451090973
key: train_precision
value: [0.90569395 0.90140845 0.89792388 0.89845095 0.88850174 0.88717949
0.90592334 0.90798611 0.9 0.89241623]
mean value: 0.8985484134106576
key: test_recall
value: [0.36111111 0.39814815 0.37383178 0.26168224 0.30841121 0.36111111
0.33333333 0.25 0.34259259 0.28703704]
mean value: 0.3277258566978193
key: train_recall
value: [0.5252838 0.52837977 0.53505155 0.53814433 0.5257732 0.53560372
0.53663571 0.53973168 0.52941176 0.52218782]
mean value: 0.5316203334290852
key: test_accuracy
value: [0.74928775 0.76353276 0.74285714 0.7 0.71428571 0.74
0.74285714 0.70285714 0.76285714 0.73428571]
mean value: 0.7352820512820514
key: train_accuracy
value: [0.83719454 0.83719454 0.83819797 0.83914975 0.83375635 0.83629442
0.84041878 0.84168782 0.83724619 0.83375635]
mean value: 0.8374896697044045
key: test_roc_auc
value: [0.64146091 0.66203704 0.63959078 0.57734318 0.60070767 0.63510101
0.62947658 0.57747934 0.64650291 0.61046067]
mean value: 0.6220160079666358
key: train_roc_auc
value: [0.75049708 0.75135762 0.75400607 0.75555246 0.74822115 0.75268505
0.75594955 0.75772658 0.75165045 0.74712231]
mean value: 0.7524768326814152
key: test_jcc
value: [0.30708661 0.34126984 0.30769231 0.21052632 0.2481203 0.3
0.28571429 0.20610687 0.30833333 0.25 ]
mean value: 0.27648498689533574
key: train_jcc
value: [0.49804305 0.4995122 0.50437318 0.50728863 0.49323017 0.50144928
0.5083089 0.51174168 0.5 0.49126214]
mean value: 0.5015209219285816
MCC on Blind test: -0.03
Accuracy on Blind test: 0.69
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02363658 0.01844215 0.0195322 0.02016759 0.01816201 0.01733851
0.01837373 0.01851177 0.01979208 0.01869726]
mean value: 0.019265389442443846
key: score_time
value: [0.04622602 0.03083086 0.03058767 0.02885675 0.027174 0.03360391
0.03074574 0.03027773 0.02961588 0.04056311]
mean value: 0.032848167419433597
key: test_mcc
value: [0.23836565 0.23403006 0.28054316 0.16578307 0.20150758 0.26290302
0.21560049 0.16182617 0.34210538 0.24722671]
mean value: 0.23498912807312805
key: train_mcc
value: [0.46314017 0.51336554 0.47477518 0.51026006 0.49658416 0.50338467
0.52206567 0.50885521 0.50128134 0.49077305]
mean value: 0.49844850465229423
key: test_fscore
value: [0.40462428 0.42391304 0.47120419 0.35632184 0.40217391 0.43575419
0.3908046 0.33532934 0.48275862 0.4137931 ]
mean value: 0.4116677114641102
key: train_fscore
value: [0.58292079 0.62393162 0.59744991 0.62102689 0.60632362 0.61993958
0.63138686 0.62257282 0.62208955 0.60768761]
mean value: 0.6135329260909395
key: test_precision
value: [0.53846154 0.51315789 0.53571429 0.46268657 0.48051948 0.54929577
0.51515152 0.47457627 0.63636364 0.54545455]
mean value: 0.5251381509400351
key: train_precision
value: [0.72797527 0.76382661 0.7267356 0.76276276 0.76049767 0.74781341
0.76888889 0.75552283 0.73796034 0.74328358]
mean value: 0.7495266955218653
key: test_recall
value: [0.32407407 0.36111111 0.42056075 0.28971963 0.34579439 0.36111111
0.31481481 0.25925926 0.38888889 0.33333333]
mean value: 0.3398667358947733
key: train_recall
value: [0.48606811 0.52734778 0.50721649 0.52371134 0.50412371 0.52941176
0.53560372 0.52941176 0.5376677 0.51393189]
mean value: 0.5194494270849956
key: test_accuracy
value: [0.70655271 0.6980057 0.71142857 0.68 0.68571429 0.71142857
0.69714286 0.68285714 0.74285714 0.70857143]
mean value: 0.7024558404558404
key: train_accuracy
value: [0.78609965 0.80450651 0.78965736 0.80329949 0.79854061 0.80044416
0.80774112 0.80266497 0.79917513 0.79600254]
mean value: 0.7988131537486286
key: test_roc_auc
value: [0.60030864 0.60442387 0.63003346 0.57078574 0.59059267 0.61443985
0.5912917 0.56558004 0.64485767 0.6046832 ]
mean value: 0.6016996843096626
key: train_roc_auc
value: [0.70270408 0.72746857 0.71121595 0.72565035 0.71677313 0.72508151
0.73207121 0.72668481 0.72646097 0.71757062]
mean value: 0.7211681207055062
key: test_jcc
value: [0.25362319 0.26896552 0.30821918 0.21678322 0.25170068 0.27857143
0.24285714 0.20143885 0.31818182 0.26086957]
mean value: 0.26012105845333383
key: train_jcc
value: [0.41135371 0.45341615 0.42597403 0.45035461 0.43505338 0.44921191
0.46133333 0.45198238 0.45147314 0.43645925]
mean value: 0.4426611881854671
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
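The ConvergenceWarning above is scikit-learn's standard lbfgs message, presumably raised by the LogisticRegression-based models in this run; following its own advice, a minimal sketch of giving the solver more iterations (the features are already MinMax-scaled by the 'prep' step, so scaling is not the issue here; the value 5000 is illustrative):

from sklearn.linear_model import LogisticRegression

# More iterations for the default lbfgs solver.
logreg = LogisticRegression(random_state=42, max_iter=5000)

# Alternatively, a different solver; 'saga' also supports l1/elasticnet penalties.
logreg_saga = LogisticRegression(random_state=42, solver='saga', max_iter=5000)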
MCC on Blind test: 0.01
Accuracy on Blind test: 0.7
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.13375473 0.16326976 0.12306356 0.13328266 0.12028813 0.11996245
0.12084413 0.12088394 0.12044263 0.12117863]
mean value: 0.12769706249237062
key: score_time
value: [0.0291698 0.02833986 0.01341963 0.01336265 0.01336145 0.01333642
0.01360631 0.01339364 0.01336622 0.01346111]
mean value: 0.016481709480285645
key: test_mcc
value: [0.4073189 0.42020753 0.46315417 0.32458565 0.35306878 0.4136644
0.34013984 0.30222634 0.52255806 0.4500904 ]
mean value: 0.3997014055515561
key: train_mcc
value: [0.47056792 0.46534121 0.44997076 0.47225123 0.45361107 0.47403631
0.47673545 0.4773211 0.45202757 0.45300887]
mean value: 0.4644871484105123
key: test_fscore
value: [0.5625 0.5786802 0.61306533 0.51020408 0.53061224 0.57575758
0.51308901 0.47567568 0.64583333 0.57923497]
mean value: 0.5584652418889247
key: train_fscore
value: [0.61123853 0.60440835 0.59267868 0.61433447 0.59594203 0.61156069
0.61396422 0.61529615 0.59525188 0.59500291]
mean value: 0.6049677911847995
key: test_precision
value: [0.64285714 0.64044944 0.66304348 0.56179775 0.58426966 0.63333333
0.59036145 0.57142857 0.73809524 0.70666667]
mean value: 0.633230273035754
key: train_precision
value: [0.68774194 0.69006623 0.67909454 0.68527919 0.6807947 0.69513798
0.69633508 0.69480519 0.67810026 0.68085106]
mean value: 0.6868206168434133
key: test_recall
value: [0.5 0.52777778 0.57009346 0.46728972 0.48598131 0.52777778
0.4537037 0.40740741 0.57407407 0.49074074]
mean value: 0.500484596746279
key: train_recall
value: [0.5500516 0.5376677 0.5257732 0.55670103 0.52989691 0.54592363
0.54901961 0.55211558 0.53044376 0.52837977]
mean value: 0.5405972785207409
key: test_accuracy
value: [0.76068376 0.76353276 0.78 0.72571429 0.73714286 0.76
0.73428571 0.72285714 0.80571429 0.78 ]
mean value: 0.756993080993081
key: train_accuracy
value: [0.78483021 0.78356077 0.77760152 0.78489848 0.77887056 0.78680203
0.78775381 0.78775381 0.77823604 0.77887056]
mean value: 0.7829177789018715
key: test_roc_auc
value: [0.6882716 0.69804527 0.72126072 0.65339795 0.66685897 0.69570707
0.65660392 0.63552189 0.74158249 0.69991582]
mean value: 0.6857165697059967
key: train_roc_auc
value: [0.71957209 0.71521332 0.70766203 0.72152192 0.70972389 0.71982393
0.72137192 0.72223278 0.70933548 0.70921966]
mean value: 0.7155677023858551
key: test_jcc
value: [0.39130435 0.40714286 0.44202899 0.34246575 0.36111111 0.40425532
0.34507042 0.31205674 0.47692308 0.40769231]
mean value: 0.38900509189001437
key: train_jcc
value: [0.44013212 0.43308396 0.42113955 0.44334975 0.42444261 0.44046628
0.4429642 0.44435216 0.42374279 0.42349049]
mean value: 0.4337163904742951
MCC on Blind test: -0.04
Accuracy on Blind test: 0.79
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.06958032 0.0704143 0.07084465 0.07698393 0.07380819 0.06988978
0.07012153 0.08230162 0.07992363 0.07867622]
mean value: 0.07425441741943359
key: score_time
value: [0.01726627 0.01724505 0.01621366 0.01609278 0.01617527 0.01618195
0.01781774 0.01625228 0.01623321 0.01635242]
mean value: 0.016583061218261717
key: test_mcc
value: [0.38825158 0.457052 0.44083752 0.34447943 0.39432843 0.4103462
0.34456681 0.25411803 0.4760101 0.38954472]
mean value: 0.389953481278362
key: train_mcc
value: [0.44746336 0.43584538 0.43437514 0.44773138 0.43427748 0.44784801
0.44029554 0.45181596 0.43221637 0.43835351]
mean value: 0.44102221178688117
key: test_fscore
value: [0.54255319 0.59685864 0.5959596 0.51578947 0.55026455 0.57711443
0.50273224 0.43715847 0.60638298 0.51724138]
mean value: 0.5442054946418133
key: train_fscore
value: [0.58740436 0.57584771 0.57582938 0.59008746 0.57699115 0.58677686
0.58235294 0.59478261 0.57245081 0.57904085]
mean value: 0.582156413004612
key: test_precision
value: [0.6375 0.68674699 0.64835165 0.59036145 0.63414634 0.62365591
0.61333333 0.53333333 0.7125 0.68181818]
mean value: 0.6361747186013347
key: train_precision
value: [0.68356164 0.67977528 0.67688022 0.67919463 0.67448276 0.68551724
0.67715458 0.67857143 0.6779661 0.67916667]
mean value: 0.679227055814455
key: test_recall
value: [0.47222222 0.52777778 0.55140187 0.45794393 0.48598131 0.53703704
0.42592593 0.37037037 0.52777778 0.41666667]
mean value: 0.4773104880581516
key: train_recall
value: [0.51496388 0.499484 0.50103093 0.52164948 0.50412371 0.5128999
0.51083591 0.52941176 0.49535604 0.50464396]
mean value: 0.5094399582947666
key: test_accuracy
value: [0.75498575 0.78062678 0.77142857 0.73714286 0.75714286 0.75714286
0.74 0.70571429 0.78857143 0.76 ]
mean value: 0.7552755392755393
key: train_accuracy
value: [0.77753094 0.77372263 0.77284264 0.77696701 0.77252538 0.77791878
0.77474619 0.77823604 0.77252538 0.77442893]
mean value: 0.7751443925625094
key: test_roc_auc
value: [0.67644033 0.71039095 0.70985731 0.65901311 0.68126226 0.69620447
0.65304561 0.61287114 0.71636823 0.6649449 ]
mean value: 0.6780398306358846
key: train_roc_auc
value: [0.70454885 0.69749636 0.69735323 0.70605847 0.69798303 0.70422824
0.7013639 0.70904853 0.69545631 0.69941314]
mean value: 0.7012950049802817
key: test_jcc
value: [0.37226277 0.42537313 0.42446043 0.34751773 0.37956204 0.40559441
0.33576642 0.27972028 0.4351145 0.34883721]
mean value: 0.3754208935789206
key: train_jcc
value: [0.41583333 0.40434419 0.40432612 0.41852771 0.40547264 0.41520468
0.41078838 0.42326733 0.40100251 0.4075 ]
mean value: 0.41062668890491894
MCC on Blind test: -0.01
Accuracy on Blind test: 0.78
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [1.02331614 0.88047814 1.1186409 0.89670253 1.0611279 0.89869523
0.92736363 1.16069984 0.90899348 1.01710081]
mean value: 0.9893118619918824
key: score_time
value: [0.01662946 0.01335859 0.01498771 0.0132885 0.0133636 0.01334023
0.01333547 0.01936579 0.01324677 0.01580977]
mean value: 0.014672589302062989
key: test_mcc
value: [0.38604402 0.45197104 0.37466737 0.38479154 0.37024178 0.38988717
0.30806222 0.25165776 0.44313744 0.36493614]
mean value: 0.3725396486852549
key: train_mcc
value: [0.43617394 0.38221488 0.38554519 0.39442064 0.39021076 0.38871891
0.39649497 0.44589667 0.38157851 0.42840628]
mean value: 0.40296607216120756
key: test_fscore
value: [0.53763441 0.55294118 0.51685393 0.51162791 0.51933702 0.54736842
0.45348837 0.44086022 0.5443787 0.49710983]
mean value: 0.5121599974222204
key: train_fscore
value: [0.575179 0.50965251 0.51252408 0.52644836 0.51666667 0.51509313
0.52538071 0.58622719 0.50711514 0.5702381 ]
mean value: 0.5344524883439817
key: test_precision
value: [0.64102564 0.75806452 0.64788732 0.67692308 0.63513514 0.63414634
0.609375 0.52564103 0.75409836 0.66153846]
mean value: 0.6543834882455187
key: train_precision
value: [0.68175389 0.67692308 0.67972743 0.6763754 0.68305085 0.68197279
0.68204283 0.68219178 0.67937608 0.67369902]
mean value: 0.6797113148389633
key: test_recall
value: [0.46296296 0.43518519 0.42990654 0.41121495 0.43925234 0.48148148
0.36111111 0.37962963 0.42592593 0.39814815]
mean value: 0.4224818276220145
key: train_recall
value: [0.49742002 0.40866873 0.41134021 0.43092784 0.41546392 0.41382869
0.42724458 0.51393189 0.40454076 0.49432405]
mean value: 0.4417690679093124
key: test_accuracy
value: [0.75498575 0.78347578 0.75428571 0.76 0.75142857 0.75428571
0.73142857 0.70285714 0.78 0.75142857]
mean value: 0.7524175824175824
key: train_accuracy
value: [0.77403999 0.75817201 0.75920051 0.76142132 0.7607868 0.76046954
0.76269036 0.77696701 0.75824873 0.77093909]
mean value: 0.7642935346445492
key: test_roc_auc
value: [0.67386831 0.6867284 0.66351294 0.6623976 0.66407061 0.67875727
0.62890266 0.61336853 0.68197123 0.65361953]
mean value: 0.6607197084918222
key: train_roc_auc
value: [0.69715181 0.66102547 0.66259036 0.6696344 0.66488136 0.66408338
0.66941707 0.70382806 0.6598975 0.69402414]
mean value: 0.6746533554412456
key: test_jcc
value: [0.36764706 0.38211382 0.34848485 0.34375 0.35074627 0.37681159
0.29323308 0.28275862 0.37398374 0.33076923]
mean value: 0.34502982653092557
key: train_jcc
value: [0.40368509 0.34196891 0.34455959 0.35726496 0.34831461 0.34688581
0.35628227 0.41465445 0.33968804 0.3988343 ]
mean value: 0.365213803959852
MCC on Blind test: -0.02
Accuracy on Blind test: 0.74
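A minimal sketch of how per-fold dictionaries like the one above could be produced. The scorer names, n_splits=10 (matching the ten values per key) and the toy data are assumptions; only the fit_time/score_time/test_*/train_* key pattern comes from sklearn's cross_validate.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import (make_scorer, matthews_corrcoef, f1_score,
                             precision_score, recall_score, accuracy_score,
                             roc_auc_score, jaccard_score)
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=300, random_state=42)   # toy data, not the study's features
scorers = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': make_scorer(f1_score),
           'precision': make_scorer(precision_score),
           'recall': make_scorer(recall_score),
           'accuracy': make_scorer(accuracy_score),
           'roc_auc': make_scorer(roc_auc_score),
           'jcc': make_scorer(jaccard_score)}
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(LogisticRegressionCV(cv=3, random_state=42), X, y,
                        cv=skf, scoring=scorers, return_train_score=True)
for key, value in scores.items():
    print('key:', key)
    print('mean value:', value.mean())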
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [10.84695458 3.25894451 7.42751551 5.65094352 5.22740793 9.49621797
6.70749855 8.2631793 6.39622879 7.25859618]
mean value: 7.053348684310913
key: score_time
value: [0.01381111 0.01665616 0.01378012 0.01453614 0.01382446 0.01394248
0.01390457 0.01394033 0.01392841 0.01393223]
mean value: 0.014225602149963379
key: test_mcc
value: [0.34308379 0.43666303 0.45335713 0.28566657 0.39069163 0.41502811
0.34083513 0.28251356 0.42434982 0.35148312]
mean value: 0.3723671885987604
key: train_mcc
value: [0.66962737 0.50975128 0.59875976 0.58809296 0.52024596 0.63194776
0.5960432 0.63370511 0.589063 0.61699468]
mean value: 0.5954231072613336
key: test_fscore
value: [0.54028436 0.59296482 0.61538462 0.52320675 0.55208333 0.56994819
0.54954955 0.49019608 0.58706468 0.5257732 ]
mean value: 0.5546455571085601
key: train_fscore
value: [0.7640327 0.64623955 0.71032186 0.72054528 0.64803195 0.72303207
0.726127 0.74408828 0.70729053 0.72903226]
mean value: 0.711874148160358
key: test_precision
value: [0.55339806 0.64835165 0.63366337 0.47692308 0.62352941 0.64705882
0.53508772 0.52083333 0.6344086 0.59302326]
mean value: 0.5866277295753974
key: train_precision
value: [0.80946882 0.70217918 0.76923077 0.68265683 0.72541507 0.8310992
0.68464351 0.75802998 0.74798619 0.76094276]
mean value: 0.7471652301286992
key: test_recall
value: [0.52777778 0.5462963 0.59813084 0.57943925 0.4953271 0.50925926
0.56481481 0.46296296 0.5462963 0.47222222]
mean value: 0.5302526825891312
key: train_recall
value: [0.72342621 0.59855521 0.65979381 0.7628866 0.58556701 0.63983488
0.77296182 0.73065015 0.67079463 0.6996904 ]
mean value: 0.6844160735373911
key: test_accuracy
value: [0.72364672 0.76923077 0.77142857 0.67714286 0.75428571 0.76285714
0.71428571 0.70285714 0.76285714 0.73714286]
mean value: 0.7375734635734637
key: train_accuracy
value: [0.86258331 0.79847667 0.83439086 0.8178934 0.80425127 0.84930203
0.82074873 0.84549492 0.82931472 0.84010152]
mean value: 0.8302557442887359
key: test_roc_auc
value: [0.66923868 0.70730453 0.72293373 0.64980193 0.68181993 0.69264616
0.67290328 0.63644016 0.70290021 0.66379706]
mean value: 0.6799785672579027
key: train_roc_auc
value: [0.82390376 0.7429073 0.78590057 0.80261653 0.74351678 0.79105807
0.80746121 0.81356145 0.78523699 0.80105913]
mean value: 0.789722179257127
key: test_jcc
value: [0.37012987 0.42142857 0.44444444 0.35428571 0.38129496 0.39855072
0.37888199 0.32467532 0.41549296 0.35664336]
mean value: 0.38458279155978586
key: train_jcc
value: [0.61816578 0.47736626 0.55077453 0.56316591 0.47932489 0.56621005
0.57001522 0.59246862 0.54713805 0.57360406]
mean value: 0.5538233360461919
MCC on Blind test: -0.04
Accuracy on Blind test: 0.82
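A quick consistency check of two of the reported keys: for one set of predictions the Jaccard coefficient equals F1/(2 - F1), so each 'jcc' value can be recovered from the matching 'fscore' value. The number below is copied from the first MLP CV fold above.

f1 = 0.54028436                   # first value under test_fscore above
print(round(f1 / (2 - f1), 8))    # ~0.37012987, the first value under test_jcc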
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02773094 0.02594233 0.0253613 0.02532125 0.02529359 0.02716494
0.02527809 0.02536917 0.02622843 0.02503729]
mean value: 0.02587273120880127
key: score_time
value: [0.01334834 0.01335287 0.01334095 0.01341772 0.01333547 0.01338434
0.01331067 0.0133884 0.01366425 0.01333523]
mean value: 0.013387823104858398
key: test_mcc
value: [0.43820345 0.2736916 0.22178411 0.15253962 0.32610051 0.19981054
0.32283817 0.21258049 0.30315709 0.16945312]
mean value: 0.26201587000897486
key: train_mcc
value: [0.25804861 0.26194065 0.26693342 0.27356999 0.26866249 0.27684651
0.26564424 0.27660615 0.26252082 0.27609091]
mean value: 0.26868637885911506
key: test_fscore
value: [0.60194175 0.51082251 0.47368421 0.43103448 0.54385965 0.45535714
0.53811659 0.46696035 0.53043478 0.42922374]
mean value: 0.4981435214912304
key: train_fscore
value: [0.49064239 0.4952381 0.50024863 0.51106833 0.50222883 0.51187591
0.50074074 0.50719603 0.49950836 0.50694444]
mean value: 0.5025691758051845
key: test_precision
value: [0.63265306 0.4796748 0.44628099 0.4 0.51239669 0.43965517
0.52173913 0.44537815 0.5 0.42342342]
mean value: 0.48012014214553733
key: train_precision
value: [0.48115079 0.48148148 0.48318924 0.47924188 0.48331745 0.48263254
0.48011364 0.48852772 0.47699531 0.48806113]
mean value: 0.4824711173045208
key: test_recall
value: [0.57407407 0.5462963 0.5046729 0.46728972 0.57943925 0.47222222
0.55555556 0.49074074 0.56481481 0.43518519]
mean value: 0.5190290758047768
key: train_recall
value: [0.500516 0.50980392 0.5185567 0.54742268 0.52268041 0.54489164
0.52321981 0.52734778 0.52425181 0.52734778]
mean value: 0.5246038534784505
key: test_accuracy
value: [0.76638177 0.67806268 0.65714286 0.62285714 0.70285714 0.65142857
0.70571429 0.65428571 0.69142857 0.64285714]
mean value: 0.6773015873015874
key: train_accuracy
value: [0.68041891 0.68041891 0.68115482 0.67766497 0.68115482 0.6805203
0.67925127 0.68496193 0.67703046 0.68464467]
mean value: 0.6807221077991517
key: test_roc_auc
value: [0.71296296 0.64146091 0.61447637 0.57932387 0.66832045 0.60181359
0.66414141 0.60900673 0.65637435 0.58536119]
mean value: 0.6333241831766802
key: train_roc_auc
value: [0.63041382 0.63299545 0.63599696 0.64149319 0.63714222 0.64280771
0.63586552 0.6411361 0.63454917 0.64090706]
mean value: 0.637330719766589
key: test_jcc
value: [0.43055556 0.34302326 0.31034483 0.27472527 0.37349398 0.29479769
0.36809816 0.3045977 0.36094675 0.27325581]
mean value: 0.3333838997620123
key: train_jcc
value: [0.32506702 0.32911392 0.33355438 0.34324499 0.33531746 0.34397394
0.33399209 0.33976064 0.33289646 0.33953488]
mean value: 0.33564557950437873
MCC on Blind test: 0.03
Accuracy on Blind test: 0.47
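One reason the MinMaxScaler step matters for this classifier: MultinomialNB rejects negative feature values, and scaling to [0, 1] guarantees non-negativity. A sketch with toy values (not the study's features):

import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler

X = np.array([[-1.2, 0.5], [0.3, 2.0], [1.1, -0.4]])
y = [0, 1, 0]
X_scaled = MinMaxScaler().fit_transform(X)   # every column mapped into [0, 1]
MultinomialNB().fit(X_scaled, y)             # fine; fitting on the raw X would raise ValueError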
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02892113 0.028929 0.02833509 0.02840018 0.02841353 0.02855992
0.02845263 0.02836442 0.02919459 0.02835846]
mean value: 0.028592896461486817
key: score_time
value: [0.01389027 0.01379681 0.01381564 0.01380849 0.01379967 0.01385236
0.01376271 0.01376081 0.01382709 0.01379967]
mean value: 0.013811349868774414
key: test_mcc
value: [0.32630607 0.23736411 0.13985473 0.09660755 0.25563123 0.15964405
0.26662299 0.15964405 0.29651086 0.13351964]
mean value: 0.20717052798573712
key: train_mcc
value: [0.21468889 0.22762278 0.23399604 0.22800146 0.22483841 0.2331307
0.23014473 0.24267498 0.21879169 0.24168225]
mean value: 0.2295571917861563
key: test_fscore
value: [0.5026178 0.44329897 0.37305699 0.33684211 0.44324324 0.38974359
0.42105263 0.38974359 0.48167539 0.34972678]
mean value: 0.41310010931369073
key: train_fscore
value: [0.41661721 0.425 0.43211334 0.43548387 0.42832066 0.44100802
0.43058824 0.43679525 0.42388759 0.44276583]
mean value: 0.4312580005695298
key: test_precision
value: [0.57831325 0.5 0.41860465 0.38554217 0.52564103 0.43678161
0.57142857 0.43678161 0.55421687 0.42666667]
mean value: 0.48339764224464854
key: train_precision
value: [0.49022346 0.5021097 0.50552486 0.49347258 0.49526387 0.4954955
0.50068399 0.51396648 0.48985115 0.50664894]
mean value: 0.499324054200173
key: test_recall
value: [0.44444444 0.39814815 0.3364486 0.29906542 0.38317757 0.35185185
0.33333333 0.35185185 0.42592593 0.2962963 ]
mean value: 0.36205434406368986
key: train_recall
value: [0.3622291 0.36842105 0.37731959 0.38969072 0.37731959 0.39731682
0.37770898 0.37977296 0.37358101 0.39318885]
mean value: 0.37965486791569586
key: test_accuracy
value: [0.72934473 0.69230769 0.65428571 0.64 0.70571429 0.66
0.71714286 0.66 0.71714286 0.66 ]
mean value: 0.6835938135938135
key: train_accuracy
value: [0.68803554 0.69343066 0.69479695 0.68908629 0.69003807 0.69035533
0.6928934 0.69892132 0.68781726 0.69574873]
mean value: 0.6921123561612058
key: test_roc_auc
value: [0.65020576 0.61059671 0.56534364 0.54459444 0.6154571 0.57468626
0.61088154 0.57468626 0.63651668 0.55930517]
mean value: 0.594227355686574
key: train_roc_auc
value: [0.59747569 0.60309229 0.60662496 0.6059361 0.60318775 0.60887371
0.60525394 0.61017966 0.60044144 0.61161962]
mean value: 0.6052685162649682
key: test_jcc
value: [0.33566434 0.28476821 0.22929936 0.20253165 0.28472222 0.24203822
0.26666667 0.24203822 0.31724138 0.21192053]
mean value: 0.26168907873333885
key: train_jcc
value: [0.26311844 0.26984127 0.27560241 0.27835052 0.2725242 0.28288024
0.27436282 0.27942293 0.26894502 0.28432836]
mean value: 0.2749376200389316
MCC on Blind test: 0.04
Accuracy on Blind test: 0.29
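BernoulliNB adds one further transformation: by default it binarises features at a threshold of 0.0, so any strictly positive MinMax-scaled value is treated as 1. A sketch of that step in isolation (toy values):

import numpy as np
from sklearn.preprocessing import binarize

print(binarize(np.array([[0.0, 0.2, 0.9]]), threshold=0.0))   # [[0. 1. 1.]]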
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.0742321 0.05576539 0.03834748 0.04215097 0.0556159 0.05554962
0.04225445 0.05216122 0.06737113 0.04966068]
mean value: 0.05331089496612549
key: score_time
value: [0.01249552 0.01328373 0.01329446 0.01324511 0.01326609 0.01328063
0.01333904 0.01542926 0.01162767 0.01318765]
mean value: 0.013244915008544921
key: test_mcc
value: [0.305824 0.1392715 0.34847115 0.14677247 0.16481024 0.33368834
0.08686835 0.3510101 0.11348048 0.26724869]
mean value: 0.22574453408589684
key: train_mcc
value: [0.38306402 0.11879482 0.36713585 0.1674599 0.18958939 0.33899987
0.19960057 0.43928717 0.12624199 0.32659243]
mean value: 0.26567660111029545
key: test_fscore
value: [0.55516014 0.05405405 0.57861635 0.11965812 0.15 0.40268456
0.08547009 0.57258065 0.03636364 0.36 ]
mean value: 0.2914587599015587
key: train_fscore
value: [0.5997648 0.05205205 0.59082701 0.11143132 0.12952381 0.39665653
0.13688213 0.62910382 0.05976096 0.44398907]
mean value: 0.3149991495983481
key: test_precision
value: [0.45086705 1. 0.43601896 0.7 0.69230769 0.73170732
0.55555556 0.50714286 1. 0.64285714]
mean value: 0.6716456574305512
key: train_precision
value: [0.48356511 0.86666667 0.45469705 0.81690141 0.85 0.75216138
0.86746988 0.55175097 0.85714286 0.65656566]
mean value: 0.7156920985769661
key: test_recall
value: [0.72222222 0.02777778 0.85981308 0.06542056 0.08411215 0.27777778
0.0462963 0.65740741 0.01851852 0.25 ]
mean value: 0.30093457943925234
key: train_recall
value: [0.78947368 0.02683179 0.84329897 0.05979381 0.07010309 0.26934985
0.07430341 0.73168215 0.03095975 0.33539732]
mean value: 0.32311938123051714
key: test_accuracy
value: [0.64387464 0.7008547 0.61714286 0.70571429 0.70857143 0.74571429
0.69428571 0.69714286 0.69714286 0.72571429]
mean value: 0.6936157916157917
key: train_accuracy
value: [0.67597588 0.69946049 0.64054569 0.70653553 0.71002538 0.74809645
0.71192893 0.73477157 0.70050761 0.74175127]
mean value: 0.7069598805954761
key: test_roc_auc
value: [0.66563786 0.51388889 0.68505057 0.52653744 0.53382562 0.61616162
0.51488369 0.68614172 0.50925926 0.59400826]
mean value: 0.5845394932362741
key: train_roc_auc
value: [0.70752328 0.5124993 0.69685572 0.52691799 0.53230178 0.61497726
0.53463223 0.73391253 0.51433466 0.62876142]
mean value: 0.600271616743876
key: test_jcc
value: [0.38423645 0.02777778 0.40707965 0.06363636 0.08108108 0.25210084
0.04464286 0.40112994 0.01851852 0.2195122 ]
mean value: 0.18997156763371786
key: train_jcc
value: [0.42833147 0.02672148 0.41927217 0.05900305 0.06924644 0.24739336
0.07346939 0.45889968 0.03080082 0.28533802]
mean value: 0.20984758689882854
MCC on Blind test: -0.03
Accuracy on Blind test: 0.8
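A sketch of one possible diagnostic for the "Variables are collinear" warnings emitted above; the 0.99 threshold and the toy matrix are arbitrary choices, not part of this script:

import numpy as np

def collinear_pairs(X, thresh=0.99):
    # Return index pairs of columns whose absolute correlation exceeds thresh.
    corr = np.corrcoef(X, rowvar=False)
    iu = np.triu_indices_from(corr, k=1)
    return [(i, j) for i, j in zip(*iu) if abs(corr[i, j]) > thresh]

X = np.random.default_rng(42).normal(size=(100, 3))
X = np.column_stack([X, X[:, 0] * 2.0])   # column 3 is an exact multiple of column 0
print(collinear_pairs(X))                 # [(0, 3)]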
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.06275249 0.07007003 0.07505083 0.07110786 0.06314135 0.06551933
0.06556749 0.06590605 0.06585693 0.06369591]
mean value: 0.0668668270111084
key: score_time
value: [0.0153718 0.0174222 0.01478672 0.01479149 0.01685905 0.01480484
0.01477504 0.01480556 0.01481795 0.01482773]
mean value: 0.015326237678527832
key: test_mcc
value: [0.10492353 0.11033958 0.07917453 0.10979452 0.14523641 0.06461843
0.065791 0.06461843 0.09849959 0.065791 ]
mean value: 0.09087870318951301
key: train_mcc
value: [0.13028192 0.13316984 0.13820049 0.12974332 0.12915673 0.14470913
0.13196911 0.13196911 0.13196911 0.13311737]
mean value: 0.13342861537511536
key: test_fscore
value: [0.48181818 0.48198198 0.47511312 0.47963801 0.48526077 0.47619048
0.47640449 0.47619048 0.48089888 0.47640449]
mean value: 0.4789900883546432
key: train_fscore
value: [0.48401598 0.48462116 0.48620171 0.48415273 0.48403194 0.48705705
0.48425787 0.48425787 0.48425787 0.4845 ]
mean value: 0.4847354176824523
key: test_precision
value: [0.31927711 0.31845238 0.31343284 0.31641791 0.32035928 0.31531532
0.31454006 0.31531532 0.31750742 0.31454006]
mean value: 0.3165157684814517
key: train_precision
value: [0.31927512 0.31980198 0.32128647 0.31939414 0.31928901 0.32192691
0.31948566 0.31948566 0.31948566 0.31969647]
mean value: 0.31991270741876254
key: test_recall
value: [0.98148148 0.99074074 0.98130841 0.99065421 1. 0.97222222
0.98148148 0.97222222 0.99074074 0.98148148]
mean value: 0.9842332987192799
key: train_recall
value: [1. 1. 0.99896907 1. 1. 1.
1. 1. 1. 1. ]
mean value: 0.9998969072164948
key: test_accuracy
value: [0.35042735 0.34472934 0.33714286 0.34285714 0.35142857 0.34
0.33428571 0.34 0.34 0.33428571]
mean value: 0.34151566951566953
key: train_accuracy
value: [0.34433513 0.34592193 0.35025381 0.34422589 0.34390863 0.35247462
0.34517766 0.34517766 0.34517766 0.34581218]
mean value: 0.34624651830778086
key: test_roc_auc
value: [0.52572016 0.52417695 0.51740318 0.52413369 0.53292181 0.51503673
0.51346801 0.51503673 0.52016376 0.51346801]
mean value: 0.5201529041635717
key: train_roc_auc
value: [0.52658112 0.52772686 0.53041946 0.52635197 0.52612282 0.53252405
0.52725607 0.52725607 0.52725607 0.52771415]
mean value: 0.5279208639467812
key: test_jcc
value: [0.31736527 0.31750742 0.3115727 0.31547619 0.32035928 0.3125
0.31268437 0.3125 0.31656805 0.31268437]
mean value: 0.3149217638969456
key: train_jcc
value: [0.31927512 0.31980198 0.32117998 0.31939414 0.31928901 0.32192691
0.31948566 0.31948566 0.31948566 0.31969647]
mean value: 0.31990205821517786
MCC on Blind test: 0.03
Accuracy on Blind test: 0.08
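The QDA pattern above (recall near 1.0 with precision and accuracy near the positive-class rate) is the signature of a model that labels almost every sample positive. A sketch with toy labels (not the study's folds):

import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = np.array([1] * 31 + [0] * 69)   # arbitrary 31% positive toy split
y_pred = np.ones_like(y_true)            # label everything positive
print(recall_score(y_true, y_pred),      # 1.0
      precision_score(y_true, y_pred),   # 0.31
      accuracy_score(y_true, y_pred))    # 0.31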
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [9.69815874 9.55301309 9.65572405 9.90236759 9.78565121 9.65619826
9.46304631 9.53649521 9.56446266 9.48821688]
mean value: 9.630333399772644
key: score_time
value: [0.14259696 0.14086413 0.15150118 0.2105577 0.14180732 0.14131832
0.14111161 0.13978195 0.14225721 0.14205599]
mean value: 0.14938523769378662
key: test_mcc
value: [0.43428682 0.41039856 0.4423657 0.38646686 0.37110308 0.43181818
0.40134242 0.21486016 0.53332085 0.37932594]
mean value: 0.40052885645996145
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57894737 0.54444444 0.59067358 0.51724138 0.53191489 0.57446809
0.53631285 0.37869822 0.64130435 0.50292398]
mean value: 0.5396929144477136
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.67073171 0.68055556 0.6627907 0.67164179 0.61728395 0.675
0.67605634 0.52459016 0.77631579 0.68253968]
mean value: 0.6637505676185069
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.50925926 0.4537037 0.53271028 0.42056075 0.46728972 0.5
0.44444444 0.2962963 0.5462963 0.39814815]
mean value: 0.45687088958116984
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.77207977 0.76638177 0.77428571 0.76 0.74857143 0.77142857
0.76285714 0.7 0.81142857 0.75714286]
mean value: 0.7624175824175825
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.69907407 0.67952675 0.70668436 0.66501288 0.66985885 0.69628099
0.67470156 0.58823079 0.73802418 0.65775176]
mean value: 0.6775146203849121
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.40740741 0.3740458 0.41911765 0.34883721 0.36231884 0.40298507
0.36641221 0.23357664 0.472 0.3359375 ]
mean value: 0.3722638336578074
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.02
Accuracy on Blind test: 0.71
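The FutureWarning printed above names its own remedy. As a sketch (not how this script sets the parameter), stating the default explicitly keeps the behaviour and silences the warning:

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=1000, random_state=42,
                            max_features='sqrt')   # the explicit default named by the warning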
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [2.0124507 2.09822416 2.05270982 2.09729695 2.1110568 2.09646678
2.07309604 2.0698843 2.11029196 2.07656288]
mean value: 2.079804039001465
key: score_time
value: [0.33807611 0.15001702 0.37944484 0.36809707 0.15305543 0.35096097
0.34709024 0.38295197 0.33342695 0.35730577]
mean value: 0.3160426378250122
key: test_mcc
value: [0.39888867 0.44685976 0.43825425 0.3638278 0.33735052 0.46789823
0.38783695 0.22469227 0.50165673 0.40187097]
mean value: 0.3969136153896103
key: train_mcc
value: [0.79422588 0.80081735 0.7946531 0.80079093 0.79247809 0.79716398
0.7928175 0.79772424 0.80104971 0.79328808]
mean value: 0.7965008879122801
key: test_fscore
value: [0.54347826 0.56497175 0.58201058 0.50285714 0.49162011 0.59893048
0.53038674 0.38823529 0.61111111 0.51190476]
mean value: 0.5325506237629998
key: train_fscore
value: [0.84069767 0.84708598 0.84246971 0.84761357 0.84027778 0.84356895
0.84131564 0.8445985 0.84480747 0.84039466]
mean value: 0.843282991652627
key: test_precision
value: [0.65789474 0.72463768 0.67073171 0.64705882 0.61111111 0.70886076
0.65753425 0.53225806 0.76388889 0.71666667]
mean value: 0.6690642686099819
key: train_precision
value: [0.96271638 0.96073298 0.95674967 0.95838752 0.95778364 0.96169089
0.95418848 0.95931759 0.97181208 0.9602122 ]
mean value: 0.9603591426395782
key: test_recall
value: [0.46296296 0.46296296 0.51401869 0.41121495 0.41121495 0.51851852
0.44444444 0.30555556 0.50925926 0.39814815]
mean value: 0.4438300449982693
key: train_recall
value: [0.74613003 0.75748194 0.75257732 0.75979381 0.74845361 0.75128999
0.75232198 0.75438596 0.74716202 0.74716202]
mean value: 0.7516758694796422
key: test_accuracy
value: [0.76068376 0.78062678 0.77428571 0.75142857 0.74 0.78571429
0.75714286 0.70285714 0.8 0.76571429]
mean value: 0.7618453398453399
key: train_accuracy
value: [0.91304348 0.91589971 0.91338832 0.9159264 0.91243655 0.9143401
0.91275381 0.91465736 0.91560914 0.91275381]
mean value: 0.914080867487076
key: test_roc_auc
value: [0.67798354 0.69238683 0.70145379 0.65622476 0.64799431 0.7117386
0.67056933 0.59286042 0.71950566 0.66395011]
mean value: 0.673466734909433
key: train_roc_auc
value: [0.86664888 0.87186654 0.86872679 0.87256418 0.86689408 0.86900276
0.8681445 0.87009266 0.86877112 0.86670973]
mean value: 0.8689421254006773
key: test_jcc
value: [0.37313433 0.39370079 0.41044776 0.33587786 0.32592593 0.42748092
0.36090226 0.24087591 0.44 0.344 ]
mean value: 0.36523457495535505
key: train_jcc
value: [0.72517553 0.73473473 0.72781655 0.73552894 0.7245509 0.72945892
0.72609562 0.731 0.73131313 0.72472472]
mean value: 0.7290399043386196
MCC on Blind test: 0.01
Accuracy on Blind test: 0.72
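With oob_score=True, the fitted forest also carries an out-of-bag accuracy estimate alongside the CV scores above. A self-contained sketch with toy data (make_classification stands in for the study's features):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=42)
rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=5,
                            oob_score=True, random_state=42).fit(X, y)
print(rf.oob_score_)   # out-of-bag accuracy estimate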
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05497599 0.04969144 0.03960133 0.03975677 0.04787111 0.03980231
0.03986311 0.04015255 0.03969288 0.04209948]
mean value: 0.0433506965637207
key: score_time
value: [0.02100301 0.02088165 0.02100801 0.02072668 0.0206759 0.02067447
0.0207541 0.0207262 0.02068782 0.02070904]
mean value: 0.020784687995910645
key: test_mcc
value: [0.41969837 0.44711188 0.4528962 0.36381274 0.34504255 0.40536264
0.32443164 0.28426762 0.47918482 0.38084245]
mean value: 0.3902650906999413
key: train_mcc
value: [0.44109835 0.44165084 0.44624177 0.45312996 0.44581364 0.45144848
0.45692914 0.45837482 0.42988385 0.44839858]
mean value: 0.4472969420433929
key: test_fscore
value: [0.56842105 0.58510638 0.59259259 0.51648352 0.50810811 0.56852792
0.48044693 0.45901639 0.59668508 0.50867052]
mean value: 0.5384058495497313
key: train_fscore
value: [0.57708458 0.57692308 0.57850242 0.58921162 0.58083832 0.58527828
0.59033989 0.59421146 0.56447689 0.58006042]
mean value: 0.5816926953776681
key: test_precision
value: [0.65853659 0.6875 0.68292683 0.62666667 0.6025641 0.62921348
0.6056338 0.56 0.73972603 0.67692308]
mean value: 0.6469690574148222
key: train_precision
value: [0.68911175 0.69064748 0.69825073 0.69316597 0.69285714 0.6965812
0.69915254 0.69475138 0.68740741 0.69970845]
mean value: 0.6941634053289556
key: test_recall
value: [0.5 0.50925926 0.52336449 0.43925234 0.43925234 0.51851852
0.39814815 0.38888889 0.5 0.40740741]
mean value: 0.46240913811007267
key: train_recall
value: [0.49638803 0.49535604 0.49381443 0.51237113 0.5 0.50464396
0.51083591 0.51909185 0.47884417 0.49535604]
mean value: 0.5006701562882343
key: test_accuracy
value: [0.76638177 0.77777778 0.78 0.74857143 0.74 0.75714286
0.73428571 0.71714286 0.79142857 0.75714286]
mean value: 0.756987382987383
key: train_accuracy
value: [0.7762615 0.77657886 0.7785533 0.78013959 0.77791878 0.78013959
0.78204315 0.78204315 0.77284264 0.77950508]
mean value: 0.7786025647324916
key: test_roc_auc
value: [0.69238683 0.7031893 0.7081843 0.662013 0.65584016 0.69107744
0.64122283 0.62626263 0.7107438 0.66031527]
mean value: 0.6751235569134181
key: train_roc_auc
value: [0.69846899 0.69841129 0.69947367 0.7057731 0.70073327 0.70353591
0.70663188 0.70892751 0.6910941 0.70049524]
mean value: 0.7013544961701687
key: test_jcc
value: [0.39705882 0.41353383 0.42105263 0.34814815 0.34057971 0.39716312
0.31617647 0.29787234 0.42519685 0.34108527]
mean value: 0.36978672012805747
key: train_jcc
value: [0.40556492 0.40540541 0.40696686 0.41764706 0.4092827 0.41370558
0.41878173 0.42268908 0.39322034 0.40851064]
mean value: 0.41017743162321824
MCC on Blind test: -0.03
Accuracy on Blind test: 0.8
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.21218586 0.22487879 0.25018668 0.24553132 0.23346281 0.22956419
0.27101207 0.20863605 0.21304107 0.20641422]
mean value: 0.22949130535125734
key: score_time
value: [0.02126575 0.02086926 0.02161551 0.02743173 0.02590466 0.02545929
0.02201629 0.02069378 0.0207119 0.02070785]
mean value: 0.022667598724365235
key: test_mcc
value: [0.40089186 0.46028325 0.4423657 0.37484632 0.37968648 0.42781714
0.34639249 0.25681816 0.47745274 0.36322552]
mean value: 0.39297796714179856
key: train_mcc
value: [0.42964081 0.42758647 0.42307922 0.43376872 0.43162556 0.4267374
0.43049523 0.4430004 0.42056117 0.42429786]
mean value: 0.42907928414801494
key: test_fscore
value: [0.5483871 0.59139785 0.59067358 0.52972973 0.53968254 0.58585859
0.49438202 0.43333333 0.58757062 0.49122807]
mean value: 0.5392243424086557
key: train_fscore
value: [0.56394641 0.56344869 0.5603396 0.57431629 0.5686747 0.56435045
0.56954436 0.57878969 0.55569155 0.55921856]
mean value: 0.5658320315895116
key: test_precision
value: [0.65384615 0.70512821 0.6627907 0.62820513 0.62195122 0.64444444
0.62857143 0.54166667 0.75362319 0.66666667]
mean value: 0.6506893799121104
key: train_precision
value: [0.68796434 0.68436578 0.68041237 0.67837079 0.68405797 0.68075802
0.6795422 0.69 0.68270677 0.68460389]
mean value: 0.6832782123112823
key: test_recall
value: [0.47222222 0.50925926 0.53271028 0.45794393 0.47663551 0.53703704
0.40740741 0.36111111 0.48148148 0.38888889]
mean value: 0.46246971270335757
key: train_recall
value: [0.47781218 0.47884417 0.47628866 0.49793814 0.48659794 0.48194014
0.49019608 0.49845201 0.46852425 0.47265222]
mean value: 0.48292457948996204
key: test_accuracy
value: [0.76068376 0.78347578 0.77428571 0.75142857 0.75142857 0.76571429
0.74285714 0.70857143 0.79142857 0.75142857]
mean value: 0.7581302401302403
key: train_accuracy
value: [0.77277055 0.77181847 0.76998731 0.77284264 0.77284264 0.77125635
0.77220812 0.77696701 0.76967005 0.77093909]
mean value: 0.7721302217328477
key: test_roc_auc
value: [0.68055556 0.70730453 0.70668436 0.66930118 0.67453175 0.70240282
0.6499847 0.61237374 0.70561677 0.65105601]
mean value: 0.6759811407444277
key: train_roc_auc
value: [0.6907851 0.6903845 0.68841931 0.69649428 0.6933448 0.69080974
0.6937925 0.69952376 0.68593414 0.68799812]
mean value: 0.6917486244881196
key: test_jcc
value: [0.37777778 0.41984733 0.41911765 0.36029412 0.36956522 0.41428571
0.32835821 0.27659574 0.416 0.3255814 ]
mean value: 0.37074231513898653
key: train_jcc
value: [0.39270568 0.39222316 0.38921651 0.4028357 0.3973064 0.39309764
0.39815591 0.40725126 0.38474576 0.38813559]
mean value: 0.3945673623428941
MCC on Blind test: -0.04
Accuracy on Blind test: 0.73
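RidgeClassifierCV(cv=3) picks its regularisation strength by internal cross-validation and exposes the selection after fitting. A sketch with toy data:

from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifierCV

X, y = make_classification(n_samples=200, random_state=42)
clf = RidgeClassifierCV(cv=3).fit(X, y)
print(clf.alpha_)   # picked from the default grid (0.1, 1.0, 10.0)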
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.55239344 0.53404284 0.43111777 0.5280683 0.47399902 0.53223133
0.49989295 0.46138453 0.54418755 0.51415801]
mean value: 0.5071475744247437
key: score_time
value: [0.12435341 0.12515259 0.12607288 0.12950492 0.12039733 0.11679816
0.1313374 0.12869167 0.15721178 0.12583566]
mean value: 0.1285355806350708
key: test_mcc
value: [0.45155396 0.38482869 0.3563687 0.30700364 0.2880209 0.38596027
0.35869299 0.26662299 0.43424128 0.20780069]
mean value: 0.3441094118116766
key: train_mcc
value: [0.43300912 0.43882531 0.42388485 0.45809603 0.44031728 0.44696958
0.43502961 0.46384235 0.42512041 0.40648445]
mean value: 0.4371578994042662
key: test_fscore
value: [0.54761905 0.4939759 0.47904192 0.44311377 0.42424242 0.52513966
0.47272727 0.42105263 0.53012048 0.28368794]
mean value: 0.4620721058399496
key: train_fscore
value: [0.53901639 0.53907285 0.52021563 0.56630365 0.54064772 0.55208333
0.54153041 0.57051282 0.51962111 0.49619903]
mean value: 0.5385202953784166
key: test_precision
value: [0.76666667 0.70689655 0.66666667 0.61666667 0.60344828 0.66197183
0.68421053 0.57142857 0.75862069 0.60606061]
mean value: 0.6642637052032263
key: train_precision
value: [0.73920863 0.75231054 0.75097276 0.74788494 0.75322284 0.74779541
0.73928571 0.75296108 0.75442043 0.75104603]
mean value: 0.7489108377640666
key: test_recall
value: [0.42592593 0.37962963 0.37383178 0.34579439 0.3271028 0.43518519
0.36111111 0.33333333 0.40740741 0.18518519]
mean value: 0.3574506749740395
key: train_recall
value: [0.42414861 0.42002064 0.39793814 0.4556701 0.42164948 0.4375645
0.42724458 0.45923633 0.39628483 0.37048504]
mean value: 0.4210242252082602
key: test_accuracy
value: [0.78347578 0.76068376 0.75142857 0.73428571 0.72857143 0.75714286
0.75142857 0.71714286 0.77714286 0.71142857]
mean value: 0.7472730972730973
key: train_accuracy
value: [0.77689622 0.77911774 0.77411168 0.78521574 0.77950508 0.78172589
0.77760152 0.78743655 0.77474619 0.76871827]
mean value: 0.7785074877526593
key: test_roc_auc
value: [0.68415638 0.65483539 0.64576362 0.62557209 0.6162263 0.66800582
0.64336547 0.61088154 0.67477808 0.56573309]
mean value: 0.6389317790065927
key: train_roc_auc
value: [0.67884791 0.67930455 0.66963818 0.69369206 0.68011897 0.68602916
0.68018207 0.69617794 0.66951209 0.65798645]
mean value: 0.6791489375320268
key: test_jcc
value: [0.37704918 0.328 0.31496063 0.28461538 0.26923077 0.35606061
0.30952381 0.26666667 0.36065574 0.16528926]
mean value: 0.303205204024963
key: train_jcc
value: [0.36894075 0.36899365 0.35154827 0.39499553 0.37047101 0.38129496
0.37130045 0.39910314 0.35100548 0.32996324]
mean value: 0.3687616494737401
MCC on Blind test: -0.03
Accuracy on Blind test: 0.7
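SVC with default settings has no predict_proba, but ranking metrics such as ROC AUC can still be computed from its decision_function margins. A sketch with toy data (resubstitution only, for illustration):

from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=200, random_state=42)
clf = SVC(random_state=42).fit(X, y)
print(roc_auc_score(y, clf.decision_function(X)))   # AUC from margins, illustration only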
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.08841491 0.11297154 0.11702013 0.12259078 0.12597108 0.10838795
0.12249351 0.10710287 0.12898302 0.10433817]
mean value: 0.11382739543914795
key: score_time
value: [0.01736617 0.01123977 0.01137972 0.01109433 0.01112485 0.01136565
0.01115441 0.01128054 0.01119924 0.01130056]
mean value: 0.011850523948669433
key: test_mcc
value: [0.21077681 0.35920458 0.48848121 0.22420323 0.16481024 0.32190153
0.34746116 0.1609488 0.3675434 0.10820346]
mean value: 0.27535344178612764
key: train_mcc
value: [0.34261114 0.41989334 0.44770256 0.2898972 0.20077601 0.40074723
0.46383436 0.19596511 0.33431109 0.22334703]
mean value: 0.33190850655910975
key: test_fscore
value: [0.29370629 0.47272727 0.64485981 0.22222222 0.15 0.45882353
0.56790123 0.07142857 0.38571429 0.10169492]
mean value: 0.3369078138116661
key: train_fscore
value: [0.37637795 0.50723639 0.60864865 0.26352531 0.12835249 0.50135501
0.64292237 0.13333333 0.36102236 0.17407407]
mean value: 0.36968479455376013
key: test_precision
value: [0.6 0.68421053 0.64485981 0.73684211 0.69230769 0.62903226
0.51111111 1. 0.84375 0.6 ]
mean value: 0.6942113506146379
key: train_precision
value: [0.79401993 0.76348548 0.63977273 0.85795455 0.90540541 0.72978304
0.57657658 0.86419753 0.79858657 0.84684685]
mean value: 0.7776628653067048
key: test_recall
value: [0.19444444 0.36111111 0.64485981 0.13084112 0.08411215 0.36111111
0.63888889 0.03703704 0.25 0.05555556]
mean value: 0.2757961232260297
key: train_recall
value: [0.24664603 0.37977296 0.58041237 0.1556701 0.06907216 0.38183695
0.72652219 0.07223942 0.23323013 0.09700722]
mean value: 0.29424095411360424
key: test_accuracy
value: [0.71225071 0.75213675 0.78285714 0.72 0.70857143 0.73714286
0.7 0.70285714 0.75428571 0.69714286]
mean value: 0.7267244607244608
key: train_accuracy
value: [0.74865122 0.77308791 0.77030457 0.7322335 0.71129442 0.76649746
0.75190355 0.71129442 0.74619289 0.71700508]
mean value: 0.7428465018759656
key: test_roc_auc
value: [0.56841564 0.64351852 0.74424061 0.55513249 0.53382562 0.63303489
0.68308081 0.51851852 0.61466942 0.51951331]
mean value: 0.6013949836957956
key: train_roc_auc
value: [0.60911586 0.66376366 0.71756641 0.57210636 0.53293205 0.65953964
0.74484607 0.53360024 0.60355964 0.54460989]
mean value: 0.6181639812952449
key: test_jcc
value: [0.17213115 0.30952381 0.47586207 0.125 0.08108108 0.29770992
0.39655172 0.03703704 0.23893805 0.05357143]
mean value: 0.21874062736192554
key: train_jcc
value: [0.23181377 0.33979686 0.43745144 0.15175879 0.06857728 0.33453888
0.47375505 0.07142857 0.2202729 0.09533469]
mean value: 0.24247282298687733
MCC on Blind test: -0.02
Accuracy on Blind test: 0.79
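The "MCC on Blind test" / "Accuracy on Blind test" lines are presumably computed from predictions on the held-out blind-test split after fitting on the training split; a hedged sketch (function and variable names are placeholders, not the script's own):
from sklearn.metrics import accuracy_score, matthews_corrcoef

def blind_test_report(fitted_pipe, X_bts, y_bts):
    # Score a fitted pipeline on the blind-test (BT) split, rounded to 2 dp to
    # match the precision printed in this log.
    y_pred = fitted_pipe.predict(X_bts)
    print('MCC on Blind test:', round(matthews_corrcoef(y_bts, y_pred), 2))
    print('Accuracy on Blind test:', round(accuracy_score(y_bts, y_pred), 2))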
Running classifier: 24
Model_name: XGBoost
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  scoresDF_BT['source_data'] = 'BT'
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
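Two housekeeping sketches suggested by the warnings interleaved above (assumptions about possible fixes, not the script's actual code): install a warnings filter before importing xgboost to silence the Int64Index FutureWarning, and assign the source_data column on an explicit copy to avoid the SettingWithCopyWarning raised for scoresDF_CV / scoresDF_BT.
import warnings

# Must run before `import xgboost`; the pattern matches the start of the
# deprecation message printed repeatedly in this log.
warnings.filterwarnings('ignore', message='pandas.Int64Index is deprecated',
                        category=FutureWarning)

# scoresDF_CV / scoresDF_BT are the frames named in the SettingWithCopyWarning;
# taking an explicit copy makes the column assignment unambiguous.
scoresDF_CV = scoresDF_CV.copy()
scoresDF_CV.loc[:, 'source_data'] = 'CV'
scoresDF_BT = scoresDF_BT.copy()
scoresDF_BT.loc[:, 'source_data'] = 'BT'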
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.57454109 0.50616693 0.41509223 0.41727304 0.57601357 0.43081117
0.43102908 0.42782593 0.42852306 0.56878781]
mean value: 0.47760639190673826
key: score_time
value: [0.01263714 0.01339722 0.01256919 0.01259017 0.01265049 0.01262379
0.01304245 0.012918 0.01240897 0.01301384]
mean value: 0.012785124778747558
key: test_mcc
value: [0.46337468 0.42178651 0.45726051 0.38218923 0.42319613 0.46047208
0.38276434 0.35093265 0.52808188 0.45648379]
mean value: 0.4326541788435946
key: train_mcc
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.99925503 1. ]
mean value: 0.9999255025167125
key: test_fscore
value: [0.61764706 0.57291667 0.61 0.54450262 0.59223301 0.61386139
0.55102041 0.50549451 0.64550265 0.59685864]
mean value: 0.5850036937042467
key: train_fscore
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.99948374 1. ]
mean value: 0.9999483737738771
key: test_precision
value: [0.65625 0.6547619 0.65591398 0.61904762 0.61616162 0.65957447
0.61363636 0.62162162 0.75308642 0.68674699]
mean value: 0.6536800979513748
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.58333333 0.50925926 0.57009346 0.48598131 0.57009346 0.57407407
0.5 0.42592593 0.56481481 0.52777778]
mean value: 0.5311353409484251
key: train_recall
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.99896801 1. ]
mean value: 0.9998968008255934
key: test_accuracy
value: [0.77777778 0.76638177 0.77714286 0.75142857 0.76 0.77714286
0.74857143 0.74285714 0.80857143 0.78 ]
mean value: 0.768987382987383
key: train_accuracy
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.99968274 1. ]
mean value: 0.9999682741116752
key: test_roc_auc
value: [0.72376543 0.69495885 0.71920311 0.67714703 0.70685743 0.72092133
0.67975207 0.65511172 0.74108509 0.71016988]
mean value: 0.7028971946724236
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1.
0.999484 1. ]
mean value: 0.9999484004127966
key: test_jcc
value: [0.44680851 0.40145985 0.43884892 0.37410072 0.42068966 0.44285714
0.38028169 0.33823529 0.4765625 0.42537313]
mean value: 0.41452174215570725
key: train_jcc
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.99896801 1. ]
mean value: 0.9998968008255934
MCC on Blind test: -0.05
Accuracy on Blind test: 0.73
Extracting tts_split_name: logo_skf_BT_gid
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_gid
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
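A minimal sketch of the row-bind step described above, assuming cv_df and bt_df are the per-model CV and blind-test score frames (names are placeholders):
import pandas as pd

def rowbind_cv_bt(cv_df, bt_df, model_name):
    # Keep only the columns shared by both frames (8 in this run), stack them
    # by row with pd.concat, then tag every row with the model name.
    common = cv_df.columns.intersection(bt_df.columns)
    combined = pd.concat([cv_df[common], bt_df[common]], axis=0, ignore_index=True)
    combined['Model_name'] = model_name
    return combined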
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: embb
Total genes: 6
Training on: 4
Training on genes: ['katg', 'pnca', 'gid', 'rpob']
Omitted genes: ['alr', 'embb']
Blind test gene: embb
/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_embb.csv
Training data dim: (2904, 171)
Training Target dim: (2904,)
Checked training df does NOT have Target var
TEST data dim: (858, 171)
TEST Target dim: (858,)
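A rough sketch of the leave-one-gene-out split logged above, assuming df holds the combined data with a 'gene_name' column and the target in a column referred to here as target_col (a placeholder):
bts_gene = 'embb'
omitted = ['alr', 'embb']   # genes left out of training; embb is the blind-test gene
target_col = 'target'       # placeholder for the actual target column name

train_df = df[~df['gene_name'].isin(omitted)]
bts_df = df[df['gene_name'] == bts_gene]

X_train, y_train = train_df.drop(columns=[target_col]), train_df[target_col]
X_bts, y_bts = bts_df.drop(columns=[target_col]), bts_df[target_col]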
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
================================================================
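A hedged sketch of the driving loop behind the per-classifier blocks that follow, assuming models is the list of (name, estimator) tuples printed above and reusing the build_pipeline() placeholder from the earlier sketch; the 10 values per metric suggest 10-fold stratified CV:
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

scoring = {'mcc': make_scorer(matthews_corrcoef), 'fscore': 'f1',
           'precision': 'precision', 'recall': 'recall', 'accuracy': 'accuracy',
           'roc_auc': 'roc_auc', 'jcc': 'jaccard'}
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

for i, (name, estimator) in enumerate(models, start=1):
    print(f'Running classifier: {i}')
    print(f'Model_name: {name}')
    pipe = build_pipeline(num_cols, cat_cols, model=estimator)
    # cross_validate returns fit_time, score_time and test_/train_ arrays per
    # metric, i.e. the keys echoed in the blocks below.
    cv_out = cross_validate(pipe, X_train, y_train, cv=skf, scoring=scoring,
                            return_train_score=True)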
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
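The bagging warning above means that with the default of 10 bootstrap estimators some training rows are never out-of-bag, so their OOB vote is 0/0. One way to make the OOB estimate reliable (an assumption, not the script's actual setting) is to raise n_estimators:
from sklearn.ensemble import BaggingClassifier

# With more estimators every sample is very likely left out of at least one
# bootstrap, so oob_score_ and oob_decision_function_ are well defined.
bagging = BaggingClassifier(n_estimators=100, oob_score=True,
                            n_jobs=10, random_state=42)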
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.66365051 0.66575408 0.66804361 0.65876889 0.67223382 0.67139816
0.66538143 0.67777133 0.67934632 0.64426923]
mean value: 0.6666617393493652
key: score_time
value: [0.01856136 0.01914525 0.01830316 0.01946712 0.01901126 0.01961923
0.01893401 0.0195272 0.01821613 0.01812053]
mean value: 0.018890523910522462
key: test_mcc
value: [0.51694624 0.49003349 0.34451928 0.45262133 0.40016633 0.37951317
0.37278096 0.41628503 0.41510941 0.47590907]
mean value: 0.42638843083939293
key: train_mcc
value: [0.52812285 0.54098113 0.53517683 0.54470643 0.52569201 0.54169353
0.52250587 0.53204252 0.52836438 0.52151703]
mean value: 0.532080259326573
key: test_fscore
value: [0.67368421 0.65979381 0.53932584 0.63589744 0.58100559 0.57142857
0.56830601 0.57647059 0.59782609 0.64550265]
mean value: 0.6049240793197544
key: train_fscore
value: [0.67588695 0.68395657 0.67917676 0.68821065 0.67543335 0.68360557
0.67109234 0.67867868 0.67156863 0.66748166]
mean value: 0.6775091156393931
key: test_precision
value: [0.7032967 0.67368421 0.60759494 0.64583333 0.64197531 0.61904762
0.61176471 0.68055556 0.64705882 0.67777778]
mean value: 0.6508588974299906
key: train_precision
value: [0.7242268 0.73540856 0.73333333 0.73341837 0.71974522 0.73856209
0.72301691 0.72715573 0.73557047 0.72897196]
mean value: 0.72994094441912
key: test_recall
value: [0.64646465 0.64646465 0.48484848 0.62626263 0.53061224 0.53061224
0.53061224 0.5 0.55555556 0.61616162]
mean value: 0.5667594310451454
key: train_recall
value: [0.63359639 0.63923337 0.632469 0.64825254 0.63626126 0.63626126
0.62612613 0.63626126 0.61781285 0.61555806]
mean value: 0.6321832119605514
key: test_accuracy
value: [0.78694158 0.77319588 0.71821306 0.75601375 0.74137931 0.73103448
0.72758621 0.75172414 0.74482759 0.76896552]
mean value: 0.7499881502547694
key: train_accuracy
value: [0.79372369 0.79946422 0.79716801 0.80061232 0.79227238 0.79992349
0.79150727 0.79533282 0.79495027 0.79188982]
mean value: 0.7956844287771899
key: test_roc_auc
value: [0.75291982 0.74250316 0.66169508 0.72458965 0.68978529 0.68197279
0.67936862 0.69010417 0.69924375 0.73216458]
mean value: 0.7054346893445622
key: train_roc_auc
value: [0.75480515 0.76052051 0.75713832 0.76358166 0.75439946 0.7601932
0.7513597 0.75671696 0.7518711 0.74900659]
mean value: 0.7559592657577449
key: test_jcc
value: [0.50793651 0.49230769 0.36923077 0.46616541 0.40944882 0.4
0.39694656 0.40495868 0.42635659 0.4765625 ]
mean value: 0.4349913533625175
key: train_jcc
value: [0.51044505 0.51970669 0.51420715 0.52463504 0.5099278 0.51930147
0.50499546 0.51363636 0.50553506 0.50091743]
mean value: 0.5123307504239909
MCC on Blind test: 0.25
Accuracy on Blind test: 0.81
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.31062675 0.35606527 0.33549571 0.37351727 0.36279917 0.36105871
0.36182189 0.37128687 0.37850785 0.34757042]
mean value: 0.35587499141693113
key: score_time
value: [0.04013014 0.03965807 0.04135561 0.04093432 0.04084921 0.04527617
0.0504334 0.04320717 0.03396845 0.02785206]
mean value: 0.04036645889282227
key: test_mcc
value: [0.41518229 0.40262551 0.33764269 0.39831055 0.39941018 0.37510417
0.41332234 0.32296567 0.48931288 0.39913307]
mean value: 0.39530093342262124
key: train_mcc
value: [0.95996862 0.96508174 0.96088173 0.95143326 0.94475033 0.95846168
0.96598373 0.95915266 0.96424116 0.95750977]
mean value: 0.9587464679232799
key: test_fscore
value: [0.58285714 0.58100559 0.53631285 0.59259259 0.56470588 0.5497076
0.56626506 0.49382716 0.63636364 0.58695652]
mean value: 0.5690594034733605
key: train_fscore
value: [0.97294185 0.97645032 0.97347174 0.96707106 0.9622751 0.97165992
0.97703789 0.97238205 0.97586207 0.97109827]
mean value: 0.9720250259785381
key: test_precision
value: [0.67105263 0.65 0.6 0.62222222 0.66666667 0.64383562
0.69117647 0.625 0.72727273 0.63529412]
mean value: 0.6532520452414214
key: train_precision
value: [0.99411765 0.99531616 0.99645809 0.99170616 0.99281437 0.99881094
0.99648712 0.99411765 0.99531067 0.99644128]
mean value: 0.9951580081294751
key: test_recall
value: [0.51515152 0.52525253 0.48484848 0.56565657 0.48979592 0.47959184
0.47959184 0.40816327 0.56565657 0.54545455]
mean value: 0.5059163059163059
key: train_recall
value: [0.95264938 0.95828636 0.95152198 0.94363021 0.93355856 0.94594595
0.95833333 0.95157658 0.95715896 0.9470124 ]
mean value: 0.9499673715429072
key: test_accuracy
value: [0.74914089 0.74226804 0.71477663 0.73539519 0.74482759 0.73448276
0.75172414 0.71724138 0.77931034 0.73793103]
mean value: 0.7407097997393056
key: train_accuracy
value: [0.98201301 0.98430922 0.98239571 0.97818599 0.97513389 0.98125478
0.98469778 0.98163734 0.98393267 0.98087223]
mean value: 0.9814432633489606
key: test_roc_auc
value: [0.69247159 0.6897096 0.65909091 0.69428662 0.68239796 0.67208759
0.68510842 0.64158163 0.72785446 0.69157544]
mean value: 0.68361642084646
key: train_roc_auc
value: [0.97487625 0.97798443 0.97489193 0.9697873 0.96504116 0.97268329
0.97829761 0.97433985 0.9774214 0.97263764]
mean value: 0.9737960859241641
key: test_jcc
value: [0.41129032 0.40944882 0.36641221 0.42105263 0.39344262 0.37903226
0.39495798 0.32786885 0.46666667 0.41538462]
mean value: 0.39855569855165995
key: train_jcc
value: [0.94730942 0.95398429 0.94831461 0.93624161 0.92729306 0.94488189
0.95510662 0.9462486 0.95286195 0.94382022]
mean value: 0.9456062276056851
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/joblib/externals/loky/process_executor.py:702: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
MCC on Blind test: 0.24
Accuracy on Blind test: 0.82
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.17962861 0.17077065 0.17199659 0.16666722 0.17193794 0.18516755
0.17313027 0.19138479 0.1730268 0.17234254]
mean value: 0.17560529708862305
key: score_time
value: [0.00999212 0.0099709 0.01004863 0.00997066 0.01124072 0.01087618
0.01043844 0.0109086 0.01027513 0.00997806]
mean value: 0.01036994457244873
key: test_mcc
value: [0.32133783 0.28396523 0.26665985 0.2440885 0.26974237 0.22580778
0.23378416 0.37736364 0.33448618 0.29796639]
mean value: 0.2855201923875052
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57272727 0.5462963 0.52216749 0.51207729 0.5177665 0.4950495
0.50485437 0.58883249 0.56281407 0.54 ]
mean value: 0.5362585280400154
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.52066116 0.5042735 0.50961538 0.49074074 0.51515152 0.48076923
0.48148148 0.58585859 0.56 0.53465347]
mean value: 0.5183205065261772
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.63636364 0.5959596 0.53535354 0.53535354 0.52040816 0.51020408
0.53061224 0.59183673 0.56565657 0.54545455]
mean value: 0.556720263863121
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.67697595 0.66323024 0.66666667 0.65292096 0.67241379 0.64827586
0.64827586 0.72068966 0.7 0.68275862]
mean value: 0.6732207607536438
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66714015 0.64693813 0.63486427 0.6244476 0.63520408 0.61447704
0.61947279 0.68914753 0.66764504 0.64969062]
mean value: 0.644902725736098
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.40127389 0.37579618 0.35333333 0.34415584 0.34931507 0.32894737
0.33766234 0.41726619 0.39160839 0.36986301]
mean value: 0.36692216081173673
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.16
Accuracy on Blind test: 0.69
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02073741 0.02081442 0.02074409 0.02116275 0.02081633 0.02068639
0.02078271 0.0213716 0.02059913 0.02075219]
mean value: 0.02084670066833496
key: score_time
value: [0.01018214 0.00987315 0.01078105 0.00992274 0.00992489 0.00986624
0.00984359 0.00984716 0.00989318 0.00996542]
mean value: 0.010009956359863282
key: test_mcc
value: [0.22752532 0.21565111 0.2343489 0.262931 0.24886365 0.2372849
0.18987267 0.24817115 0.20071387 0.24486088]
mean value: 0.23102234622395318
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.5047619 0.49019608 0.47567568 0.52631579 0.51456311 0.48958333
0.4729064 0.48648649 0.47761194 0.5 ]
mean value: 0.49381007191979676
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.47747748 0.47619048 0.51162791 0.5 0.49074074 0.5
0.45714286 0.51724138 0.47058824 0.50515464]
mean value: 0.49061637123080165
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.53535354 0.50505051 0.44444444 0.55555556 0.54081633 0.47959184
0.48979592 0.45918367 0.48484848 0.49494949]
mean value: 0.49895897753040613
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.64261168 0.64261168 0.66666667 0.65979381 0.65517241 0.66206897
0.63103448 0.67241379 0.63793103 0.66206897]
mean value: 0.6532373503969666
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.6166351 0.60929609 0.61284722 0.63454861 0.627179 0.61740009
0.59646046 0.62021684 0.60106299 0.6218203 ]
mean value: 0.6157466680845748
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.33757962 0.32467532 0.31205674 0.35714286 0.34640523 0.32413793
0.30967742 0.32142857 0.31372549 0.33333333]
mean value: 0.32801625113467037
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.09
Accuracy on Blind test: 0.68
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.45524931 0.46012402 0.45626187 0.45277762 0.44455504 0.46801829
0.45031834 0.49884367 0.45119381 0.4450655 ]
mean value: 0.4582407474517822
key: score_time
value: [0.0251534 0.02579212 0.02628255 0.0237 0.02420616 0.02421689
0.02424645 0.02621198 0.02422071 0.02423644]
mean value: 0.024826669692993165
key: test_mcc
value: [0.41518229 0.41176179 0.3798193 0.21404248 0.37311432 0.38067205
0.37311432 0.42761706 0.37404015 0.42791405]
mean value: 0.3777277799996374
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.58285714 0.58888889 0.5308642 0.4494382 0.5443787 0.54761905
0.5443787 0.5625 0.55681818 0.58479532]
mean value: 0.5492538379048446
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.67105263 0.65432099 0.68253968 0.50632911 0.64788732 0.65714286
0.64788732 0.72580645 0.63636364 0.69444444]
mean value: 0.6523774453148168
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.51515152 0.53535354 0.43434343 0.4040404 0.46938776 0.46938776
0.46938776 0.45918367 0.49494949 0.50505051]
mean value: 0.47562358276643996
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.74914089 0.74570447 0.73883162 0.66323024 0.73448276 0.73793103
0.73448276 0.75862069 0.73103448 0.75517241]
mean value: 0.734863135442588
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.69247159 0.6947601 0.66508838 0.6004577 0.66958971 0.67219388
0.66958971 0.685321 0.67417632 0.69493363]
mean value: 0.6718582028142845
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.41129032 0.41732283 0.36134454 0.28985507 0.37398374 0.37704918
0.37398374 0.39130435 0.38582677 0.41322314]
mean value: 0.3795183687483372
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.22
Accuracy on Blind test: 0.81
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.21052718 3.15439987 3.12943339 3.12990713 3.14807868 3.08285475
3.08462644 3.10303879 3.11946392 3.10444546]
mean value: 3.1266775608062742
key: score_time
value: [0.01072454 0.01038861 0.01094365 0.01080275 0.01035357 0.01015162
0.0106678 0.01054573 0.01034904 0.01042771]
mean value: 0.010535502433776855
key: test_mcc
value: [0.520484 0.50335025 0.32563281 0.43871881 0.50651992 0.46188748
0.46516114 0.4758888 0.44909943 0.46064885]
mean value: 0.4607391484026041
key: train_mcc
value: [0.65433982 0.64779057 0.66320916 0.66220213 0.66887501 0.64906194
0.66373958 0.66262833 0.66317528 0.63859587]
mean value: 0.6573617692359148
key: test_fscore
value: [0.67027027 0.66666667 0.52272727 0.62105263 0.65555556 0.61714286
0.62569832 0.61988304 0.62365591 0.64285714]
mean value: 0.6265509675735227
key: train_fscore
value: [0.75692308 0.75031056 0.76190476 0.76588022 0.76561534 0.75200989
0.76452599 0.76202219 0.76012461 0.74770922]
mean value: 0.7587025871018334
key: test_precision
value: [0.72093023 0.68817204 0.5974026 0.64835165 0.7195122 0.7012987
0.69135802 0.7260274 0.66666667 0.64948454]
mean value: 0.6809204042444564
key: train_precision
value: [0.83333333 0.83540802 0.84383562 0.82637076 0.84910837 0.8340192
0.83668005 0.84196185 0.84958217 0.816 ]
mean value: 0.8366299380208829
key: test_recall
value: [0.62626263 0.64646465 0.46464646 0.5959596 0.60204082 0.55102041
0.57142857 0.54081633 0.58585859 0.63636364]
mean value: 0.5820861678004536
key: train_recall
value: [0.69334837 0.68094701 0.69447576 0.71364149 0.69707207 0.68468468
0.70382883 0.69594595 0.68771139 0.68996618]
mean value: 0.6941621723188804
key: test_accuracy
value: [0.79037801 0.78006873 0.71134021 0.75257732 0.7862069 0.76896552
0.76896552 0.77586207 0.75862069 0.75862069]
mean value: 0.7651605640478729
key: train_accuracy
value: [0.84883276 0.84615385 0.85265978 0.85189437 0.85501148 0.84659526
0.85271614 0.85233359 0.85271614 0.84200459]
mean value: 0.8500917957443669
key: test_roc_auc
value: [0.75063131 0.74771149 0.65159407 0.71464646 0.74112457 0.71561437
0.72061012 0.71832483 0.71701306 0.72917658]
mean value: 0.7206446873033682
key: train_roc_auc
value: [0.81104266 0.80600074 0.81421355 0.81829235 0.81667045 0.8072902
0.81657258 0.81436926 0.8125876 0.80502941]
mean value: 0.8122068811636034
key: test_jcc
value: [0.50406504 0.5 0.35384615 0.45038168 0.48760331 0.44628099
0.45528455 0.44915254 0.453125 0.47368421]
mean value: 0.457342347715126
key: train_jcc
value: [0.60891089 0.60039761 0.61538462 0.62058824 0.62024048 0.60257681
0.61881188 0.61553785 0.61306533 0.59707317]
mean value: 0.6112586872923956
MCC on Blind test: 0.22
Accuracy on Blind test: 0.8
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.01982689 0.01945353 0.02013445 0.01949501 0.0208087 0.01932788
0.02013111 0.0194335 0.01981068 0.0195334 ]
mean value: 0.01979551315307617
key: score_time
value: [0.0102787 0.01025414 0.01032639 0.01018405 0.0102067 0.01024723
0.01017952 0.01017046 0.01014876 0.01023722]
mean value: 0.010223317146301269
key: test_mcc
value: [0.32849977 0.26624053 0.2439226 0.15943011 0.25535708 0.3424575
0.34980087 0.33730834 0.22084507 0.28361574]
mean value: 0.2787477610802362
key: train_mcc
value: [0.28581481 0.28311637 0.29076429 0.30036434 0.2892036 0.2833415
0.27585964 0.28828135 0.27768183 0.28581657]
mean value: 0.2860244305574977
key: test_fscore
value: [0.592 0.55371901 0.53448276 0.49593496 0.53744493 0.59130435
0.60162602 0.58874459 0.52892562 0.56557377]
mean value: 0.5589756003312802
key: train_fscore
value: [0.56445672 0.56100982 0.5651359 0.57274401 0.56751825 0.56186047
0.55566038 0.56631482 0.55855019 0.56305258]
mean value: 0.5636303131786422
key: test_precision
value: [0.49006623 0.46853147 0.46616541 0.41496599 0.47286822 0.51515152
0.5 0.5112782 0.44755245 0.47586207]
mean value: 0.47624415378378887
key: train_precision
value: [0.4770428 0.47923323 0.48356055 0.48404669 0.47699387 0.47860539
0.47808442 0.47792409 0.47509881 0.47939778]
mean value: 0.4789987620578501
key: test_recall
value: [0.74747475 0.67676768 0.62626263 0.61616162 0.62244898 0.69387755
0.75510204 0.69387755 0.64646465 0.6969697 ]
mean value: 0.677540713254999
key: train_recall
value: [0.69109357 0.67643743 0.67981962 0.70124014 0.70045045 0.68018018
0.66328829 0.69481982 0.67756483 0.68207441]
mean value: 0.6846968727464782
key: test_accuracy
value: [0.64948454 0.62886598 0.62886598 0.57388316 0.63793103 0.67586207
0.66206897 0.67241379 0.60689655 0.63448276]
mean value: 0.6370754828771182
key: train_accuracy
value: [0.63796403 0.64064294 0.64485266 0.64485266 0.63733741 0.63963275
0.63963275 0.63848508 0.6365723 0.64078041]
mean value: 0.6400752988632261
key: test_roc_auc
value: [0.67321654 0.64046717 0.62823548 0.58412247 0.63414116 0.68027211
0.68484269 0.67766794 0.61642604 0.64953197]
mean value: 0.6468923570637997
key: train_roc_auc
value: [0.65087703 0.6493427 0.65335129 0.6585575 0.6526586 0.64947595
0.64537531 0.65216078 0.64654153 0.65082296]
mean value: 0.6509163654071031
key: test_jcc
value: [0.42045455 0.38285714 0.36470588 0.32972973 0.36746988 0.41975309
0.43023256 0.41717791 0.35955056 0.39428571]
mean value: 0.38862170146656155
key: train_jcc
value: [0.39320077 0.38986355 0.39386022 0.40129032 0.39617834 0.39068564
0.38471587 0.3950064 0.38749194 0.39183938]
mean value: 0.3924132439400982
MCC on Blind test: 0.14
Accuracy on Blind test: 0.66
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.01830745 2.91645956 3.04634142 2.98693275 2.92737675 3.01557112
2.81716537 2.81007552 3.05465436 2.94832063]
mean value: 2.9541204929351808
key: score_time
value: [0.08861589 0.08911896 0.0884912 0.088521 0.09943581 0.08796096
0.08785939 0.08782911 0.09619236 0.08792353]
mean value: 0.09019482135772705
key: test_mcc
value: [0.36910463 0.33491823 0.31555906 0.21984092 0.25758523 0.32237068
0.36009768 0.37896118 0.24048806 0.3210052 ]
mean value: 0.31199308690070665
key: train_mcc
value: [0.64288585 0.645032 0.65420435 0.65249687 0.64445053 0.6464353
0.64657268 0.65757675 0.64470462 0.63920136]
mean value: 0.6473560330184038
key: test_fscore
value: [0.51572327 0.49689441 0.43835616 0.4 0.39160839 0.46357616
0.49006623 0.50331126 0.4 0.47435897]
mean value: 0.4573894853113174
key: train_fscore
value: [0.71654084 0.71752577 0.72826087 0.729237 0.7198364 0.71840659
0.72081911 0.72777018 0.71830021 0.71340206]
mean value: 0.7210099034290858
key: test_precision
value: [0.68333333 0.64516129 0.68085106 0.55357143 0.62222222 0.66037736
0.69811321 0.71698113 0.58823529 0.64912281]
mean value: 0.6497969137527749
key: train_precision
value: [0.91578947 0.91901408 0.91623932 0.90909091 0.9119171 0.92077465
0.91507799 0.92682927 0.91608392 0.91373239]
mean value: 0.9164549098198582
key: test_recall
value: [0.41414141 0.4040404 0.32323232 0.31313131 0.28571429 0.35714286
0.37755102 0.3877551 0.3030303 0.37373737]
mean value: 0.3539476396619254
key: train_recall
value: [0.58850056 0.58850056 0.6042841 0.60879369 0.59459459 0.58896396
0.59459459 0.5990991 0.59075536 0.58511838]
mean value: 0.5943204901632185
key: test_accuracy
value: [0.73539519 0.72164948 0.71821306 0.68041237 0.7 0.72068966
0.73448276 0.74137931 0.68965517 0.71724138]
mean value: 0.7159118378954853
key: train_accuracy
value: [0.84194413 0.84270953 0.84691925 0.84653655 0.8427697 0.84315226
0.84353481 0.84774292 0.8427697 0.84047437]
mean value: 0.8438553217082149
key: test_roc_auc
value: [0.65759154 0.64472854 0.62255366 0.59146149 0.59858631 0.63169643
0.64710884 0.65481505 0.59654133 0.63451267]
mean value: 0.627959585537769
key: train_roc_auc
value: [0.7803453 0.78092467 0.78794738 0.78875374 0.78252325 0.78144606
0.78310263 0.78738269 0.78148075 0.77837274]
mean value: 0.7832279208279352
key: test_jcc
value: [0.34745763 0.33057851 0.28070175 0.25 0.24347826 0.30172414
0.3245614 0.33628319 0.25 0.31092437]
mean value: 0.2975709251799282
key: train_jcc
value: [0.55828877 0.55948553 0.57264957 0.5738576 0.56230032 0.56055734
0.56350053 0.57204301 0.56042781 0.55448718]
mean value: 0.5637597664290424
MCC on Blind test: 0.15
Accuracy on Blind test: 0.81
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02127504 0.01794124 0.01649594 0.01836824 0.01810288 0.01671863
0.01776671 0.01594663 0.01639152 0.01636553]
mean value: 0.017537236213684082
key: score_time
value: [0.04720855 0.027421 0.02700019 0.02791834 0.03444815 0.02814627
0.02646971 0.02549362 0.02720141 0.02756143]
mean value: 0.029886865615844728
key: test_mcc
value: [0.21703871 0.18064813 0.20700053 0.12045868 0.13382417 0.23425417
0.2802975 0.31892281 0.0999762 0.16627982]
mean value: 0.19587007186982558
key: train_mcc
value: [0.48659714 0.50197679 0.50839588 0.5127142 0.50626163 0.49625302
0.49907064 0.47860242 0.49631876 0.49827926]
mean value: 0.49844697398700955
key: test_fscore
value: [0.44571429 0.44086022 0.41463415 0.38888889 0.36585366 0.44311377
0.49142857 0.52513966 0.38709677 0.41142857]
mean value: 0.43141585488452366
key: train_fscore
value: [0.63178047 0.64698492 0.65379826 0.65626949 0.64813645 0.64370695
0.64076433 0.63065327 0.64231738 0.64402516]
mean value: 0.6438436683536837
key: test_precision
value: [0.51315789 0.47126437 0.52307692 0.43209877 0.45454545 0.53623188
0.55844156 0.58024691 0.4137931 0.47368421]
mean value: 0.4956541075661778
key: train_precision
value: [0.72794118 0.73049645 0.73018081 0.73463687 0.7381295 0.72496474
0.73753666 0.71306818 0.7275321 0.72830725]
mean value: 0.7292793734364607
key: test_recall
value: [0.39393939 0.41414141 0.34343434 0.35353535 0.30612245 0.37755102
0.43877551 0.47959184 0.36363636 0.36363636]
mean value: 0.3834364048649763
key: train_recall
value: [0.55806088 0.58060879 0.59188275 0.59301015 0.5777027 0.57882883
0.56644144 0.56531532 0.57497182 0.57722661]
mean value: 0.5764049280396517
key: test_accuracy
value: [0.66666667 0.64261168 0.67010309 0.62199313 0.64137931 0.67931034
0.69310345 0.70689655 0.60689655 0.64482759]
mean value: 0.6573788363550184
key: train_accuracy
value: [0.77918102 0.78492155 0.78721776 0.78913127 0.7869166 0.78232594
0.78423871 0.77505738 0.78270849 0.7834736 ]
mean value: 0.7835172322719285
key: test_roc_auc
value: [0.60061553 0.58727904 0.59098801 0.55697601 0.55931122 0.60544218
0.63084609 0.65125425 0.54831033 0.57710614]
mean value: 0.590812879570359
key: train_roc_auc
value: [0.72543832 0.73526384 0.73974207 0.74146452 0.73612829 0.73292542
0.73136672 0.72414086 0.7321877 0.7333151 ]
mean value: 0.7331972842058028
key: test_jcc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
value: [0.28676471 0.28275862 0.26153846 0.24137931 0.2238806 0.28461538
0.32575758 0.35606061 0.24 0.25899281]
mean value: 0.2761748067659185
key: train_jcc
value: [0.46175373 0.47818013 0.48566142 0.48839369 0.47943925 0.47460757
0.47141518 0.46055046 0.47309833 0.47495362]
mean value: 0.47480533855259804
MCC on Blind test: 0.08
Accuracy on Blind test: 0.73
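The ConvergenceWarning repeated above is raised when scikit-learn's lbfgs solver hits its iteration cap before the optimiser converges. Below is a minimal sketch of the two remedies the warning itself points to, scaling the inputs and raising max_iter; the toy data and the max_iter value are illustrative assumptions, not settings taken from this run.

# Minimal sketch: addressing "lbfgs failed to converge" by scaling features
# and raising max_iter. Data and settings are illustrative only.
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=500, n_features=50, random_state=42)

pipe = Pipeline([
    ('prep', MinMaxScaler()),                    # scale every feature to [0, 1]
    ('model', LogisticRegression(max_iter=5000,  # give lbfgs more iterations
                                 random_state=42)),
])

with warnings.catch_warnings():
    warnings.simplefilter('error', ConvergenceWarning)  # escalate if it still fails to converge
    pipe.fit(X, y)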
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.10489011 0.10125518 0.11215854 0.10802674 0.10958433 0.10423803
0.1052928 0.10489464 0.10388088 0.10477424]
mean value: 0.10589954853057862
key: score_time
value: [0.01992393 0.01315284 0.0130918 0.0133059 0.01316381 0.01312304
0.01312828 0.01314855 0.01314402 0.01313138]
mean value: 0.013831353187561036
key: test_mcc
value: [0.46787194 0.49100108 0.38422728 0.3817573 0.36996545 0.41969814
0.42352269 0.48876656 0.36975324 0.46912464]
mean value: 0.4265688326138785
key: train_mcc
value: [0.50737631 0.50684529 0.5161719 0.51915742 0.49826786 0.51827385
0.50526021 0.49853199 0.50888727 0.50016682]
mean value: 0.5078938926160477
key: test_fscore
value: [0.63829787 0.66666667 0.56497175 0.58762887 0.55172414 0.5862069
0.59550562 0.64044944 0.56989247 0.64210526]
mean value: 0.6043448983337611
key: train_fscore
value: [0.65979381 0.65855143 0.66706876 0.66666667 0.65544197 0.66707095
0.65563725 0.65330905 0.66018237 0.65452338]
mean value: 0.6598245641689743
key: test_precision
value: [0.6741573 0.65686275 0.64102564 0.6 0.63157895 0.67105263
0.6625 0.7125 0.6091954 0.67032967]
mean value: 0.6529202341070357
key: train_precision
value: [0.71391076 0.71560847 0.71725032 0.72546419 0.70322581 0.72273325
0.71908602 0.7088274 0.71635884 0.70921053]
mean value: 0.7151675585530761
key: test_recall
value: [0.60606061 0.67676768 0.50505051 0.57575758 0.48979592 0.52040816
0.54081633 0.58163265 0.53535354 0.61616162]
mean value: 0.5647804576376004
key: train_recall
value: [0.61330327 0.60992108 0.62344983 0.61668546 0.61373874 0.61936937
0.60247748 0.60585586 0.61217587 0.60766629]
mean value: 0.6124643245274587
key: test_accuracy
value: [0.76632302 0.76975945 0.73539519 0.72508591 0.73103448 0.75172414
0.75172414 0.77931034 0.72413793 0.76551724]
mean value: 0.7500011849745231
key: train_accuracy
value: [0.78530425 0.78530425 0.78874856 0.79066207 0.78079572 0.78997705
0.78500383 0.78156083 0.78615149 0.78232594]
mean value: 0.785583397824602
key: test_roc_auc
value: [0.72750947 0.74723801 0.67960859 0.68892045 0.67198129 0.69509991
0.70009566 0.73092049 0.67867153 0.72954678]
mean value: 0.7049592187838962
key: train_roc_auc
value: [0.74349984 0.74267781 0.74857312 0.74837749 0.74024133 0.7485607
0.74069413 0.73890707 0.74384127 0.73984936]
mean value: 0.7435222103782972
key: test_jcc
value: [0.46875 0.5 0.39370079 0.41605839 0.38095238 0.41463415
0.424 0.47107438 0.39849624 0.47286822]
mean value: 0.434053454667706
key: train_jcc
value: [0.49230769 0.49092559 0.50045249 0.5 0.48747764 0.50045496
0.48769371 0.48512173 0.49274047 0.48646209]
mean value: 0.492363637566635
MCC on Blind test: 0.25
Accuracy on Blind test: 0.8
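The per-fold key/value blocks above (fit_time, score_time, test_mcc, train_mcc and so on) have the shape of scikit-learn's cross_validate output with return_train_score=True. Below is a minimal sketch under that assumption, using a 10-fold StratifiedKFold, toy data and a bare LinearDiscriminantAnalysis as a stand-in for the project's actual pipeline.

# Minimal sketch of how per-fold 'fit_time', 'score_time', 'test_*' and
# 'train_*' arrays like those logged above can be produced.
# Assumptions: 10-fold StratifiedKFold, MCC and accuracy scorers, toy data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=400, n_features=20, random_state=42)

scoring = {'mcc': make_scorer(matthews_corrcoef), 'accuracy': 'accuracy'}
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

cv_out = cross_validate(LinearDiscriminantAnalysis(), X, y,
                        cv=skf, scoring=scoring, return_train_score=True)

for key, value in cv_out.items():   # e.g. 'fit_time', 'test_mcc', 'train_mcc'
    print('key:', key)
    print('value:', value)
    print('mean value:', np.mean(value))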
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.06990385 0.06841469 0.06582165 0.06808782 0.06786466 0.06922674
0.07514834 0.11026073 0.16376114 0.09373784]
mean value: 0.08522274494171142
key: score_time
value: [0.01768255 0.01717758 0.01585245 0.01589084 0.0168891 0.01604748
0.01596999 0.01733279 0.01961589 0.01383352]
mean value: 0.01662921905517578
key: test_mcc
value: [0.45689842 0.47313038 0.37713847 0.36049137 0.41202496 0.40869905
0.46188748 0.50270567 0.41510941 0.4311593 ]
mean value: 0.42992445086816533
key: train_mcc
value: [0.48288071 0.48169886 0.49216648 0.4955018 0.48310758 0.50013117
0.47006546 0.4765899 0.48376463 0.48396016]
mean value: 0.4849866752337674
key: test_fscore
value: [0.62702703 0.65326633 0.56179775 0.56842105 0.56097561 0.57309942
0.61714286 0.65945946 0.59782609 0.61780105]
mean value: 0.6036816639765918
key: train_fscore
value: [0.64233577 0.64107252 0.64927184 0.65256798 0.64367816 0.65410334
0.631062 0.63878788 0.64272672 0.64449819]
mean value: 0.6440104393703592
key: test_precision
value: [0.6744186 0.65 0.63291139 0.59340659 0.6969697 0.67123288
0.7012987 0.70114943 0.64705882 0.64130435]
mean value: 0.6609750462086402
key: train_precision
value: [0.69749009 0.69761273 0.70302234 0.703125 0.69542484 0.71070013
0.69365722 0.69160105 0.6984127 0.69491525]
mean value: 0.6985961354786829
key: test_recall
value: [0.58585859 0.65656566 0.50505051 0.54545455 0.46938776 0.5
0.55102041 0.62244898 0.55555556 0.5959596 ]
mean value: 0.5587301587301587
key: train_recall
value: [0.59526494 0.59301015 0.60315671 0.60879369 0.5990991 0.60585586
0.57882883 0.59346847 0.59526494 0.60090192]
mean value: 0.5973644585961384
key: test_accuracy
value: [0.7628866 0.7628866 0.73195876 0.71821306 0.75172414 0.74827586
0.76896552 0.78275862 0.74482759 0.74827586]
mean value: 0.7520772603389027
key: train_accuracy
value: [0.7749713 0.7745886 0.77879832 0.77994642 0.77467483 0.78232594
0.77008416 0.77199694 0.77543994 0.77505738]
mean value: 0.7757883819675093
key: test_roc_auc
value: [0.72001263 0.73713699 0.67700442 0.67637311 0.68261054 0.6875
0.71561437 0.74351616 0.69924375 0.71159236]
mean value: 0.7050604327682206
key: train_roc_auc
value: [0.73129411 0.73045641 0.73610906 0.73834818 0.73205245 0.73948644
0.72365543 0.72865776 0.73162205 0.73270342]
mean value: 0.7324385304546036
key: test_jcc
value: [0.45669291 0.48507463 0.390625 0.39705882 0.38983051 0.40163934
0.44628099 0.49193548 0.42635659 0.4469697 ]
mean value: 0.433246397824127
key: train_jcc
value: [0.47311828 0.47174888 0.48068284 0.48430493 0.47457627 0.48599819
0.46098655 0.46927872 0.4735426 0.47546833]
mean value: 0.4749705592453218
MCC on Blind test: 0.22
Accuracy on Blind test: 0.81
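The mcc and jcc keys reported for every fold appear to be the Matthews correlation coefficient and the Jaccard index of the positive class. A minimal sketch of computing both from a vector of predictions; the labels below are toy values, not fold outputs from this run.

# Minimal sketch: MCC and Jaccard (jcc) from true/predicted labels.
from sklearn.metrics import jaccard_score, matthews_corrcoef

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # toy ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # toy predictions

print('MCC:', matthews_corrcoef(y_true, y_pred))
print('JCC:', jaccard_score(y_true, y_pred))  # intersection over union of the positive class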
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [0.7989831 0.94550061 0.81592655 0.81866431 0.94806695 0.82416701
0.95044518 0.82881665 0.82229662 0.93121409]
mean value: 0.8684081077575684
key: score_time
value: [0.01349092 0.01334953 0.01338434 0.01345277 0.01347923 0.01343274
0.01340818 0.01344514 0.01336217 0.01350069]
mean value: 0.013430571556091309
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.65979381 0.65979381 0.65979381 0.65979381 0.66206897 0.66206897
0.66206897 0.66206897 0.65862069 0.65862069]
mean value: 0.6604692499111268
key: train_accuracy
value: [0.66054344 0.66054344 0.66054344 0.66054344 0.66029074 0.66029074
0.66029074 0.66029074 0.6606733 0.6606733 ]
mean value: 0.6604683310538122
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.85
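The all-zero MCC, F-score, precision and recall above, together with an accuracy of roughly 0.66 and a ROC AUC of exactly 0.5, is the pattern produced when a model predicts only the majority class for every sample; that is also what triggers the 'Precision is ill-defined' warning kept earlier in this block. A minimal sketch reproducing the pattern with a dummy majority-class predictor on toy data with a similar class balance.

# Minimal sketch: a majority-class predictor gives MCC = 0, F1/precision = 0,
# ROC AUC = 0.5 and accuracy equal to the majority-class share, matching the
# pattern logged for this classifier. Toy data only.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             precision_score, roc_auc_score)

rng = np.random.default_rng(42)
y = rng.choice([0, 1], size=300, p=[0.66, 0.34])   # ~66% majority class
X = rng.normal(size=(300, 5))

clf = DummyClassifier(strategy='most_frequent').fit(X, y)
y_pred = clf.predict(X)                            # every prediction is the majority class

print('accuracy :', accuracy_score(y, y_pred))                     # ~0.66
print('mcc      :', matthews_corrcoef(y, y_pred))                  # 0.0
print('f1       :', f1_score(y, y_pred, zero_division=0))          # 0.0
print('precision:', precision_score(y, y_pred, zero_division=0))   # 0.0, warning silenced
print('roc_auc  :', roc_auc_score(y, clf.predict_proba(X)[:, 1]))  # 0.5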
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [1.62501669 1.8834734 2.25642991 5.56029296 7.07325768 4.32152057
3.54884076 5.63865948 3.68770599 1.72877407]
mean value: 3.7323971509933473
key: score_time
value: [0.01478863 0.01706338 0.01473927 0.01379633 0.02177238 0.01366901
0.01530695 0.01376319 0.01895571 0.0136683 ]
mean value: 0.015752315521240234
key: test_mcc
value: [0.50130076 0.49680858 0.3817881 0.38787453 0.46887561 0.35986231
0.43574791 0.53664565 0.36623985 0.453412 ]
mean value: 0.43885552978148806
key: train_mcc
value: [0.48949837 0.50764745 0.52381886 0.5855848 0.63338528 0.54966042
0.48552839 0.57580872 0.48636253 0.50783259]
mean value: 0.53451274166028
key: test_fscore
value: [0.66315789 0.67906977 0.59803922 0.62184874 0.63387978 0.52439024
0.56050955 0.67045455 0.53571429 0.64356436]
mean value: 0.6130628384428581
key: train_fscore
value: [0.65615879 0.67420814 0.69181034 0.7361596 0.75854342 0.68280571
0.58139535 0.70349908 0.60346021 0.67164179]
mean value: 0.6759682431167203
key: test_precision
value: [0.69230769 0.62931034 0.58095238 0.5323741 0.68235294 0.65151515
0.74576271 0.75641026 0.65217391 0.63106796]
mean value: 0.6554227453981896
key: train_precision
value: [0.68038741 0.67650397 0.6625387 0.66010733 0.75473802 0.76071923
0.81967213 0.77327935 0.78136201 0.68421053]
mean value: 0.7253518674091145
key: test_recall
value: [0.63636364 0.73737374 0.61616162 0.74747475 0.59183673 0.43877551
0.44897959 0.60204082 0.45454545 0.65656566]
mean value: 0.5930117501546073
key: train_recall
value: [0.63359639 0.67192785 0.72378805 0.83201804 0.76238739 0.61936937
0.45045045 0.64527027 0.49154453 0.65952649]
mean value: 0.6489878830352336
key: test_accuracy
value: [0.78006873 0.7628866 0.71821306 0.69072165 0.76896552 0.73103448
0.76206897 0.8 0.73103448 0.75172414]
mean value: 0.7496717620571157
key: train_accuracy
value: [0.7745886 0.77956372 0.78109453 0.79755071 0.83511859 0.80451415
0.77964805 0.81522571 0.78079572 0.78117827]
mean value: 0.7929278040379001
key: test_roc_auc
value: [0.74526515 0.7567077 0.69349747 0.70446654 0.72560587 0.65949192
0.6854273 0.75154124 0.6644455 0.72880639]
mean value: 0.711525508585157
key: train_roc_auc
value: [0.74032079 0.75340309 0.76716633 0.80592791 0.81746252 0.75956881
0.69973276 0.7739677 0.7104509 0.75159301]
mean value: 0.7579593813556201
key: test_jcc
value: [0.49606299 0.51408451 0.42657343 0.45121951 0.464 0.3553719
0.38938053 0.5042735 0.36585366 0.47445255]
mean value: 0.4441272587291299
key: train_jcc
value: [0.48827107 0.50853242 0.52883031 0.5824783 0.61101083 0.51837889
0.40983607 0.54261364 0.432111 0.50561798]
mean value: 0.512768049866761
MCC on Blind test: 0.25
Accuracy on Blind test: 0.76
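The two Blind test lines printed after every classifier read like the cross-validated pipeline being refitted and scored once on a held-out set. A minimal sketch under that assumption; the data, the split and the MLP settings are illustrative placeholders rather than the project's own.

# Minimal sketch: scoring a fitted pipeline once on a held-out "blind" set,
# reporting MCC and accuracy as in the log. Data and split are illustrative.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=600, n_features=30, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

pipe = Pipeline([('prep', MinMaxScaler()),
                 ('model', MLPClassifier(max_iter=500, random_state=42))])
pipe.fit(X_train, y_train)

y_pred = pipe.predict(X_blind)
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))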
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02292299 0.02328873 0.0234611 0.0231936 0.02348995 0.02327085
0.02322507 0.02327132 0.02361655 0.02337241]
mean value: 0.023311257362365723
key: score_time
value: [0.01330495 0.01304317 0.01348281 0.01316762 0.01320624 0.0131495
0.01328707 0.01327205 0.01328564 0.01323104]
mean value: 0.013243007659912109
key: test_mcc
value: [0.32455864 0.18813831 0.12503093 0.11530536 0.21396684 0.21464838
0.2202381 0.21732235 0.18113757 0.21654449]
mean value: 0.20168909630694257
key: train_mcc
value: [0.19134596 0.21305338 0.21350083 0.20950143 0.20150926 0.21749978
0.19521132 0.20693163 0.20226118 0.21081674]
mean value: 0.20616315184408868
key: test_fscore
value: [0.57009346 0.4784689 0.43137255 0.44545455 0.47959184 0.49019608
0.5046729 0.49760766 0.47619048 0.49760766]
mean value: 0.4871256051497198
key: train_fscore
value: [0.48085106 0.4919268 0.49247312 0.4984456 0.4913748 0.49625668
0.48209514 0.49503916 0.48720682 0.48946515]
mean value: 0.4905134347223229
key: test_precision
value: [0.53043478 0.45454545 0.41904762 0.40495868 0.47959184 0.47169811
0.46551724 0.46846847 0.45045045 0.47272727]
mean value: 0.46174399168554625
key: train_precision
value: [0.4551863 0.47064882 0.47070915 0.4611697 0.45853659 0.47250509
0.45879959 0.46153846 0.46208291 0.46991701]
mean value: 0.4641093625648348
key: test_recall
value: [0.61616162 0.50505051 0.44444444 0.49494949 0.47959184 0.51020408
0.55102041 0.53061224 0.50505051 0.52525253]
mean value: 0.5162337662337662
key: train_recall
value: [0.50958286 0.51521984 0.51634724 0.54227734 0.52927928 0.52252252
0.50788288 0.53378378 0.51521984 0.51071026]
mean value: 0.5202825852910407
key: test_accuracy
value: [0.6838488 0.62542955 0.60137457 0.58075601 0.64827586 0.64137931
0.63448276 0.63793103 0.62068966 0.63793103]
mean value: 0.6312098589880317
key: train_accuracy
value: [0.62648297 0.63872943 0.63872943 0.62954458 0.62777353 0.63963275
0.62930375 0.63006886 0.63198164 0.63848508]
mean value: 0.6330732014695518
key: test_roc_auc
value: [0.66745581 0.59627525 0.56336806 0.55997475 0.60698342 0.60926871
0.61405187 0.61166029 0.59283939 0.6107938 ]
mean value: 0.6032671339894835
key: train_roc_auc
value: [0.59807069 0.60871073 0.60898474 0.6083345 0.60386328 0.61120332
0.59982788 0.6066949 0.6035856 0.60741072]
mean value: 0.6056686372997443
key: test_jcc
value: [0.39869281 0.31446541 0.275 0.28654971 0.31543624 0.32467532
0.3375 0.33121019 0.3125 0.33121019]
mean value: 0.32272398753165554
key: train_jcc
value: [0.31652661 0.32619557 0.32667618 0.33195307 0.32571033 0.33001422
0.31760563 0.32893824 0.32205779 0.32403433]
mean value: 0.3249711976744908
MCC on Blind test: 0.07
Accuracy on Blind test: 0.71
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02625918 0.0256815 0.02548575 0.02584147 0.02555537 0.02559757
0.02565074 0.02570558 0.02582574 0.0256176 ]
mean value: 0.025722050666809083
key: score_time
value: [0.01351094 0.01349545 0.01364303 0.0135119 0.01353884 0.01362228
0.01354051 0.01361918 0.0136404 0.01351738]
mean value: 0.013563990592956543
key: test_mcc
value: [0.2343489 0.14145456 0.09095094 0.11964787 0.16871829 0.11515365
0.20256216 0.17858111 0.04051884 0.24279934]
mean value: 0.153473565295904
key: train_mcc
value: [0.17140399 0.17995022 0.1824695 0.19234838 0.1846488 0.19222534
0.1839998 0.18137434 0.20020866 0.18352834]
mean value: 0.1852157364913661
key: test_fscore
value: [0.47567568 0.42105263 0.35632184 0.42926829 0.4180791 0.39784946
0.47474747 0.44086022 0.35602094 0.48387097]
mean value: 0.4253746597380349
key: train_fscore
value: [0.44249854 0.437046 0.44835681 0.46320593 0.45195108 0.44883303
0.45168801 0.44497041 0.45595238 0.43685174]
mean value: 0.4481353940517149
key: test_precision
value: [0.51162791 0.43956044 0.41333333 0.41509434 0.46835443 0.42045455
0.47 0.46590909 0.36956522 0.51724138]
mean value: 0.4491140682938191
key: train_precision
value: [0.45883777 0.47189542 0.46756426 0.46882217 0.46803378 0.4789272
0.46746988 0.46882793 0.48297604 0.47606383]
mean value: 0.47094182861517
key: test_recall
value: [0.44444444 0.4040404 0.31313131 0.44444444 0.37755102 0.37755102
0.47959184 0.41836735 0.34343434 0.45454545]
mean value: 0.40571016285302
key: train_recall
value: [0.42728298 0.40698985 0.43066516 0.45772266 0.43693694 0.4222973
0.43693694 0.42342342 0.43179256 0.40360767]
mean value: 0.4277655473963253
key: test_accuracy
value: [0.66666667 0.62199313 0.61512027 0.59793814 0.64482759 0.6137931
0.64137931 0.64137931 0.57586207 0.66896552]
mean value: 0.6287925109610144
key: train_accuracy
value: [0.63451971 0.64408726 0.64026024 0.63987754 0.6400153 0.64766641
0.63963275 0.64116297 0.6503443 0.6469013 ]
mean value: 0.6424467767688543
key: test_roc_auc
value: [0.61284722 0.5692077 0.54198232 0.56076389 0.57940051 0.55596301
0.60177509 0.58678784 0.51988471 0.61732508]
mean value: 0.5745937376219724
key: train_roc_auc
value: [0.58415134 0.58646132 0.58931868 0.59560525 0.59071644 0.5929563
0.59042675 0.58830499 0.59719333 0.58773319]
mean value: 0.5902867582633904
key: test_jcc
value: [0.31205674 0.26666667 0.21678322 0.27329193 0.26428571 0.24832215
0.31125828 0.28275862 0.21656051 0.31914894]
mean value: 0.2711132753000799
key: train_jcc
value: [0.28410795 0.2796282 0.28895613 0.30141054 0.29194883 0.28935185
0.29172932 0.28614916 0.29529684 0.27946916]
mean value: 0.2888047985554894
MCC on Blind test: 0.04
Accuracy on Blind test: 0.72
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.03879809 0.05005789 0.03908944 0.04336405 0.05134606 0.03866816
0.04368925 0.04672503 0.04367566 0.06264853]
mean value: 0.04580621719360352
key: score_time
value: [0.01298237 0.0132246 0.01616979 0.01306129 0.01300573 0.01305056
0.01301384 0.01311469 0.01438022 0.013412 ]
mean value: 0.013541507720947265
key: test_mcc
value: [0.37466972 0.26942027 0.43755198 0.37094543 0.3116953 0.32743905
0.4544241 0.32601779 0.02787676 0.21845143]
mean value: 0.3118491815531195
key: train_mcc
value: [0.38589534 0.32060648 0.46416309 0.50372956 0.32269043 0.35375024
0.45950365 0.3347251 0.14849271 0.25840433]
mean value: 0.35519609371834504
key: test_fscore
value: [0.48648649 0.56962025 0.64186047 0.60273973 0.29059829 0.42857143
0.65454545 0.59259259 0.01980198 0.13207547]
mean value: 0.4418892148998618
key: train_fscore
value: [0.50185598 0.59028281 0.65800416 0.68080594 0.35044248 0.4644767
0.65752033 0.59664478 0.07158351 0.20779221]
mean value: 0.47794088959456554
key: test_precision
value: [0.73469388 0.41474654 0.59482759 0.55 0.89473684 0.71428571
0.59016393 0.44221106 0.5 1. ]
mean value: 0.6435665553630308
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
key: train_precision
value: [0.73478261 0.43506146 0.61041466 0.64264264 0.81818182 0.72209026
0.59907407 0.4412082 0.94285714 0.9122807 ]
mean value: 0.6858593570101983
key: test_recall
value: [0.36363636 0.90909091 0.6969697 0.66666667 0.17346939 0.30612245
0.73469388 0.89795918 0.01010101 0.07070707]
mean value: 0.4829416615130902
key: train_recall
value: [0.38105975 0.91770011 0.71364149 0.72378805 0.22297297 0.34234234
0.7286036 0.92117117 0.03720406 0.11724915]
mean value: 0.5105732705648152
key: test_accuracy
value: [0.73883162 0.53264605 0.73539519 0.70103093 0.7137931 0.72413793
0.73793103 0.58275862 0.65862069 0.68275862]
mean value: 0.6807903780068728
key: train_accuracy
value: [0.74320704 0.56754688 0.74818217 0.76961347 0.71920428 0.73182862
0.74215761 0.57689365 0.67253252 0.69663351]
mean value: 0.6967799751170578
key: test_roc_auc
value: [0.64796402 0.62381629 0.72608902 0.69270833 0.58152636 0.62181122
0.73713861 0.65991709 0.5024327 0.53535354]
mean value: 0.6328757173184998
key: train_roc_auc
value: [0.65518805 0.65265075 0.73978714 0.75847572 0.59874025 0.63727778
0.73886727 0.66046971 0.51802299 0.55572938]
mean value: 0.6515209037779333
key: test_jcc
value: [0.32142857 0.39823009 0.47260274 0.43137255 0.17 0.27272727
0.48648649 0.42105263 0.01 0.07070707]
mean value: 0.3054607410169559
key: train_jcc
value: [0.33498513 0.41872428 0.49031758 0.51607717 0.21244635 0.30248756
0.48978047 0.42515593 0.03712036 0.11594203]
mean value: 0.334303686487625
MCC on Blind test: 0.19
Accuracy on Blind test: 0.79
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.05744934 0.05873418 0.06077385 0.0618341 0.06086016 0.0612824
0.06052709 0.06140566 0.05926919 0.05900764]
mean value: 0.060114359855651854
key: score_time
value: [0.01447964 0.01436162 0.01456594 0.01446581 0.01448965 0.01433372
0.01455665 0.01438189 0.01448298 0.01454449]
mean value: 0.014466238021850587
key: test_mcc
value: [ 0.10179334 0.11244276 0.09564358 -0.02380827 0.08710175 0.12785877
0.14962833 0.08581904 0.09620214 0.05634697]
mean value: 0.08890284062460875
key: train_mcc
value: [0.13848134 0.13924364 0.14668047 0.15028213 0.15277281 0.14156106
0.14527181 0.14558126 0.14144036 0.14218843]
mean value: 0.14435033021446053
key: test_fscore
value: [0.51733333 0.51851852 0.51578947 0.5 0.51226158 0.5171504
0.52459016 0.51187335 0.5171504 0.51187335]
mean value: 0.5146540563255654
key: train_fscore
value: [0.52084557 0.52099853 0.52253314 0.52330383 0.52438664 0.52173913
0.52250662 0.52284114 0.52130473 0.52145797]
mean value: 0.522191729965203
key: test_precision
value: [0.35144928 0.35125448 0.34875445 0.33807829 0.34944238 0.34875445
0.35820896 0.34519573 0.35 0.34642857]
mean value: 0.3487566579633132
key: train_precision
value: [0.35212386 0.3522637 0.35366826 0.35437475 0.35551102 0.35294118
0.35364397 0.35409182 0.35254372 0.3526839 ]
mean value: 0.3533846170127185
key: test_recall
value: [0.97979798 0.98989899 0.98989899 0.95959596 0.95918367 1.
0.97959184 0.98979592 0.98989899 0.97979798]
mean value: 0.9817460317460316
key: train_recall
value: [1. 1. 1. 1. 0.99887387 1.
1. 0.99887387 1. 1. ]
mean value: 0.9997747747747748
key: test_accuracy
value: [0.37800687 0.37457045 0.36769759 0.34707904 0.38275862 0.36896552
0.4 0.36206897 0.36896552 0.36206897]
mean value: 0.37121815380969314
key: train_accuracy
value: [0.37543054 0.37581324 0.37964026 0.38155377 0.38446825 0.37719969
0.37911247 0.38064269 0.37681714 0.37719969]
mean value: 0.3787877749736398
key: test_roc_auc
value: [0.52375316 0.52359533 0.51838699 0.49542298 0.52386267 0.5234375
0.54187925 0.51573129 0.5185097 0.5108414 ]
mean value: 0.5195420276531205
key: train_roc_auc
value: [0.52723059 0.52752028 0.53041715 0.53186559 0.53362002 0.52838934
0.52983778 0.53072315 0.5283729 0.52866242]
mean value: 0.5296639206827891
key: test_jcc
value: [0.34892086 0.35 0.34751773 0.33333333 0.34432234 0.34875445
0.35555556 0.34397163 0.34875445 0.34397163]
mean value: 0.346510198622554
key: train_jcc
value: [0.35212386 0.3522637 0.35366826 0.35437475 0.35536859 0.35294118
0.35364397 0.35395052 0.35254372 0.3526839 ]
mean value: 0.35335624402144095
MCC on Blind test: 0.05
Accuracy on Blind test: 0.18
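Aside, not part of this run: the repeated 'Variables are collinear' warnings and the near-degenerate blind-test accuracy above are typical of an unregularised QDA fit on many correlated descriptors; one common mitigation is covariance regularisation via reg_param (the value below is only illustrative).

from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# reg_param shrinks each class covariance estimate towards the identity; 0.1 is an example value
qda_reg = QuadraticDiscriminantAnalysis(reg_param=0.1)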
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [8.17149138 8.06483102 8.05650711 8.10280323 8.13397312 8.16241908
8.13972807 8.16250181 8.13414979 8.1745882 ]
mean value: 8.130299282073974
key: score_time
value: [0.13387084 0.13302255 0.13308597 0.13306499 0.14384317 0.13221502
0.13193846 0.13179994 0.13504839 0.14332151]
mean value: 0.13512108325958253
key: test_mcc
value: [0.48011413 0.42794529 0.32890283 0.38661312 0.47839566 0.45589748
0.36777155 0.50874973 0.44891238 0.48307463]
mean value: 0.4366376803791348
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.62427746 0.6 0.5 0.58064516 0.62857143 0.59393939
0.54651163 0.63905325 0.61016949 0.63687151]
mean value: 0.5960039322698704
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.72972973 0.66666667 0.63076923 0.62068966 0.71428571 0.73134328
0.63513514 0.76056338 0.69230769 0.7125 ]
mean value: 0.6893990487930363
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.54545455 0.54545455 0.41414141 0.54545455 0.56122449 0.5
0.47959184 0.55102041 0.54545455 0.57575758]
mean value: 0.5263553906411049
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7766323 0.75257732 0.71821306 0.73195876 0.77586207 0.76896552
0.73103448 0.78965517 0.76206897 0.77586207]
mean value: 0.7582829719161037
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
value: [0.72064394 0.70241477 0.64457071 0.68678977 0.72332058 0.703125
 0.66948342 0.73123937 0.70990005 0.72766936]
mean value: 0.7019156970657533
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.45378151 0.42857143 0.33333333 0.40909091 0.45833333 0.42241379
0.376 0.46956522 0.43902439 0.46721311]
mean value: 0.42573270324268
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.28
Accuracy on Blind test: 0.84
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
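Aside, not a change made in this run: the FutureWarning about max_features='auto' printed during these forest fits can be silenced by passing the equivalent value explicitly, as the warning itself suggests.

from sklearn.ensemble import RandomForestClassifier

# 'sqrt' reproduces the old 'auto' behaviour for classifiers, per the deprecation message
rf2 = RandomForestClassifier(max_features='sqrt', min_samples_leaf=5,
                             n_estimators=1000, n_jobs=10,
                             oob_score=True, random_state=42)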
key: fit_time
value: [1.889925 1.92179084 1.91096258 1.96866465 1.96431351 1.88536286
1.88920355 1.94275665 1.87435317 1.91168523]
mean value: 1.9159018039703368
key: score_time
value: [0.38391089 0.36571622 0.37001991 0.4212532 0.34475899 0.27236319
0.36689782 0.36829352 0.34953547 0.15427041]
mean value: 0.3397019624710083
key: test_mcc
value: [0.47763066 0.43919728 0.34297979 0.38172733 0.48157515 0.48157515
0.38520265 0.46318028 0.46060421 0.47049916]
mean value: 0.43841716748918336
key: train_mcc
value: [0.83745677 0.83348672 0.83544739 0.83196529 0.83781825 0.84274397
0.83916237 0.83821432 0.81899171 0.83473673]
mean value: 0.8350023524024982
key: test_fscore
value: [0.61077844 0.59770115 0.5 0.57142857 0.6097561 0.6097561
0.5398773 0.5875 0.6035503 0.61627907]
mean value: 0.584662702532851
key: train_fscore
value: [0.88246154 0.87871287 0.88152241 0.87891825 0.8823167 0.88657265
0.88412017 0.88357843 0.86897404 0.88068881]
mean value: 0.8807865874837448
key: test_precision
value: [0.75 0.69333333 0.6557377 0.62650602 0.75757576 0.75757576
0.67692308 0.75806452 0.72857143 0.7260274 ]
mean value: 0.7130314996383078
key: train_precision
value: [0.97154472 0.9739369 0.96765499 0.96621622 0.97414966 0.9730821
0.97039031 0.96908602 0.96169631 0.96887686]
mean value: 0.9696634075622527
key: test_recall
value: [0.51515152 0.52525253 0.4040404 0.52525253 0.51020408 0.51020408
0.44897959 0.47959184 0.51515152 0.53535354]
mean value: 0.49691816120387555
key: train_recall
value: [0.80834273 0.80045096 0.80947012 0.80608794 0.80630631 0.81418919
0.81193694 0.81193694 0.79255919 0.80721533]
mean value: 0.806849563768955
key: test_accuracy
value: [0.7766323 0.75945017 0.72508591 0.73195876 0.77931034 0.77931034
0.74137931 0.77241379 0.76896552 0.77241379]
mean value: 0.76069202512146
key: train_accuracy
value: [0.92690394 0.92499043 0.92613854 0.92460773 0.92693191 0.92922724
0.92769702 0.92731446 0.91889824 0.92578424]
mean value: 0.925849374163846
key: test_roc_auc
value: [0.71330492 0.70273043 0.6473327 0.6818971 0.71343537 0.71343537
0.6698023 0.70073342 0.70783754 0.71532075]
mean value: 0.6965829898515015
key: train_roc_auc
value: [0.89808793 0.89472142 0.89778257 0.89580179 0.8976491 0.90130085
0.89959535 0.89930566 0.88817305 0.89694872]
mean value: 0.8969366454701836
key: test_jcc
value: [0.43965517 0.42622951 0.33333333 0.4 0.43859649 0.43859649
0.3697479 0.4159292 0.43220339 0.44537815]
mean value: 0.41396696401904876
key: train_jcc
value: [0.78964758 0.78366446 0.7881449 0.78399123 0.78941566 0.79625551
0.79230769 0.79143798 0.76830601 0.78681319]
mean value: 0.7869984192950908
MCC on Blind test: 0.29
Accuracy on Blind test: 0.84
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05016327 0.03894353 0.03653336 0.03661036 0.03645778 0.04881573
0.05472684 0.03619218 0.04787183 0.04198861]
mean value: 0.04283034801483154
key: score_time
value: [0.02012396 0.02980423 0.02965236 0.02027082 0.02027011 0.02027702
0.02027202 0.02968955 0.02966285 0.02030206]
mean value: 0.02403249740600586
key: test_mcc
value: [0.4620474 0.50550152 0.39683253 0.36971882 0.41332234 0.41793794
0.42712936 0.51259192 0.38544979 0.46912464]
mean value: 0.4359656248840572
key: train_mcc
value: [0.48404707 0.49210872 0.50717491 0.50145761 0.48619578 0.49600783
0.48618486 0.47794899 0.4925121 0.49970379]
mean value: 0.49233416606518643
key: test_fscore
value: [0.62637363 0.67010309 0.56647399 0.57591623 0.56626506 0.58139535
0.58959538 0.65536723 0.5698324 0.64210526]
mean value: 0.6043427619794597
key: train_fscore
value: [0.6372122 0.64259029 0.65427509 0.64968944 0.64211172 0.64508095
0.63659148 0.63450835 0.64528069 0.65021592]
mean value: 0.6437556122955159
key: test_precision
value: [0.68674699 0.68421053 0.66216216 0.59782609 0.69117647 0.67567568
0.68 0.73417722 0.6375 0.67032967]
mean value: 0.6719804795169736
key: train_precision
value: [0.71111111 0.71766342 0.72627235 0.72337483 0.70580297 0.72144847
0.71751412 0.7037037 0.71253406 0.71798365]
mean value: 0.7157408687867652
key: test_recall
value: [0.57575758 0.65656566 0.49494949 0.55555556 0.47959184 0.51020408
0.52040816 0.59183673 0.51515152 0.61616162]
mean value: 0.5516182230467944
key: train_recall
value: [0.57722661 0.58173619 0.59526494 0.58962796 0.58896396 0.58333333
0.57207207 0.5777027 0.58962796 0.59413754]
mean value: 0.5849693267111531
key: test_accuracy
value: [0.76632302 0.78006873 0.74226804 0.72164948 0.75172414 0.75172414
0.75517241 0.78965517 0.73448276 0.76551724]
mean value: 0.7558585140419481
key: train_accuracy
value: [0.77688481 0.78032912 0.78645235 0.78415614 0.77697016 0.78194338
0.77811783 0.77390972 0.7800306 0.78309105]
mean value: 0.7801885165427057
key: test_roc_auc
value: [0.72017045 0.75015783 0.68237058 0.68142361 0.68510842 0.69260204
0.69770408 0.74123087 0.68165953 0.72954678]
mean value: 0.7061974186787201
key: train_roc_auc
value: [0.72835838 0.73206161 0.73998473 0.73687655 0.73133019 0.73372924
0.72809861 0.72627893 0.73372539 0.73713826]
mean value: 0.7327581870582579
key: test_jcc
value: [0.456 0.50387597 0.39516129 0.40441176 0.39495798 0.40983607
0.41803279 0.48739496 0.3984375 0.47286822]
mean value: 0.4340976534710461
key: train_jcc
value: [0.46757991 0.4733945 0.48618785 0.48114075 0.47287523 0.47610294
0.46691176 0.46467391 0.47632058 0.48171846]
mean value: 0.47469058959569155
MCC on Blind test: 0.23
Accuracy on Blind test: 0.81
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.19880462 0.18814206 0.18500733 0.18421388 0.18452907 0.18459868
0.21274996 0.1886313 0.2256813 0.18453932]
mean value: 0.19368975162506102
key: score_time
value: [0.02031589 0.02028036 0.02027154 0.02015972 0.02010059 0.02066565
0.0200665 0.02011657 0.0313735 0.02014661]
mean value: 0.02134969234466553
key: test_mcc
value: [0.45136262 0.46563688 0.36892959 0.39331284 0.40983545 0.45003472
0.46810433 0.48489139 0.41294244 0.4512305 ]
mean value: 0.4356280767269798
key: train_mcc
value: [0.47102239 0.47812029 0.48289679 0.48964334 0.49072281 0.48716899
0.47313985 0.4736247 0.47936114 0.47363216]
mean value: 0.4799332468252029
key: test_fscore
value: [0.61452514 0.64248705 0.54117647 0.58378378 0.55 0.6
0.61627907 0.64480874 0.59340659 0.62765957]
mean value: 0.6014126421480467
key: train_fscore
value: [0.62562814 0.63210493 0.63591022 0.64320988 0.64360716 0.63744521
0.62688442 0.62828536 0.63341646 0.63066418]
mean value: 0.6337155972182651
key: test_precision
value: [0.6875 0.65957447 0.64788732 0.62790698 0.70967742 0.70833333
0.71621622 0.69411765 0.65060241 0.66292135]
mean value: 0.6764737142689327
key: train_precision
value: [0.70638298 0.70868347 0.71129707 0.71077763 0.7127223 0.71791255
0.70880682 0.70704225 0.70850767 0.70165746]
mean value: 0.7093790201666449
key: test_recall
value: [0.55555556 0.62626263 0.46464646 0.54545455 0.44897959 0.52040816
0.54081633 0.60204082 0.54545455 0.5959596 ]
mean value: 0.5445578231292518
key: train_recall
value: [0.56144307 0.57046223 0.57497182 0.58737317 0.58671171 0.5731982
0.56193694 0.56531532 0.57271702 0.57271702]
mean value: 0.572684649136171
key: test_accuracy
value: [0.7628866 0.7628866 0.73195876 0.73539519 0.75172414 0.76551724
0.77241379 0.77586207 0.74482759 0.75862069]
mean value: 0.7562092665007702
key: train_accuracy
value: [0.77190968 0.7745886 0.7765021 0.77879832 0.77926549 0.77850038
0.77276205 0.77276205 0.77505738 0.7723795 ]
mean value: 0.7752525554207657
key: test_roc_auc
value: [0.71267361 0.72979798 0.66721907 0.68939394 0.6776148 0.70551658
0.71572066 0.73331207 0.69681104 0.71944577]
mean value: 0.7047505520532821
key: train_roc_auc
value: [0.7207563 0.72497619 0.72752067 0.73227291 0.73252156 0.72866167
0.72158261 0.72240273 0.72584896 0.72382232]
mean value: 0.7260365914130487
key: test_jcc
value: [0.44354839 0.47328244 0.37096774 0.41221374 0.37931034 0.42857143
0.44537815 0.47580645 0.421875 0.45736434]
mean value: 0.4308318029596059
key: train_jcc
value: [0.45521024 0.46210046 0.46617916 0.47406733 0.47449909 0.46783088
0.45654163 0.4580292 0.46350365 0.4605621 ]
mean value: 0.4638523737491507
MCC on Blind test: 0.24
Accuracy on Blind test: 0.82
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.33256125 0.35639691 0.30511975 0.3100152 0.35889983 0.3830955
0.39610648 0.37718201 0.34542656 0.36768675]
mean value: 0.35324902534484864
key: score_time
value: [0.09091997 0.09007263 0.09228826 0.09248519 0.09392595 0.09213376
0.09354281 0.08505416 0.08378148 0.0877521 ]
mean value: 0.090195631980896
key: test_mcc
value: [0.45136262 0.46610348 0.39509323 0.32972727 0.46311439 0.41832274
0.43969885 0.44093203 0.38125751 0.42520697]
mean value: 0.42108190859292255
key: train_mcc
value: [0.50517047 0.50480376 0.51897932 0.53299705 0.50395439 0.51471387
0.50468646 0.49408188 0.51493687 0.50359148]
mean value: 0.5097915546891382
key: test_fscore
value: [0.61452514 0.62146893 0.56140351 0.54054054 0.57142857 0.55345912
0.58682635 0.59171598 0.56 0.59550562]
mean value: 0.5796873748070652
key: train_fscore
value: [0.64175258 0.63989637 0.65477707 0.66539683 0.64507937 0.64774194
0.63707572 0.63260026 0.64587394 0.64138817]
mean value: 0.6451582242074972
key: test_precision
value: [0.6875 0.70512821 0.66666667 0.58139535 0.78571429 0.72131148
0.71014493 0.70422535 0.64473684 0.67088608]
mean value: 0.6877709179459741
key: train_precision
value: [0.74887218 0.75190259 0.75256223 0.76162791 0.73944687 0.75830816
0.75776398 0.74316109 0.76226994 0.74588939]
mean value: 0.7521804323149177
key: test_recall
value: [0.55555556 0.55555556 0.48484848 0.50505051 0.44897959 0.44897959
0.5 0.51020408 0.49494949 0.53535354]
mean value: 0.5039476396619255
key: train_recall
value: [0.56144307 0.55693348 0.5794814 0.59075536 0.57207207 0.56531532
0.54954955 0.55067568 0.56031567 0.56257046]
mean value: 0.5649112048914755
key: test_accuracy
value: [0.7628866 0.76975945 0.74226804 0.70790378 0.77241379 0.75517241
0.76206897 0.76206897 0.73448276 0.75172414]
mean value: 0.7520748903898566
key: train_accuracy
value: [0.78721776 0.78721776 0.79257558 0.79831611 0.78615149 0.79112471
0.78729916 0.78270849 0.79150727 0.78653405]
mean value: 0.789065238225329
key: test_roc_auc
value: [0.71267361 0.71788194 0.67992424 0.65877525 0.6932398 0.68021896
0.69791667 0.70041454 0.67679412 0.69961394]
mean value: 0.6917453076145578
key: train_roc_auc
value: [0.73234378 0.73124774 0.74078357 0.74786899 0.73418204 0.73630772
0.72958358 0.72638071 0.73528233 0.73206693]
mean value: 0.734604738819103
key: test_jcc
value: [0.44354839 0.45081967 0.3902439 0.37037037 0.4 0.3826087
0.41525424 0.42016807 0.38888889 0.424 ]
mean value: 0.4085902221093406
key: train_jcc
value: [0.47248577 0.47047619 0.48674242 0.49857279 0.47610122 0.47900763
0.46743295 0.46263009 0.47696737 0.47209082]
mean value: 0.4762507251861604
MCC on Blind test: 0.25
Accuracy on Blind test: 0.81
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
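Aside on the SettingWithCopyWarning above: pandas' suggested fix is to write through .loc (or to take an explicit copy of the slice first). A minimal sketch, with a toy frame standing in for the script's scoresDF_CV:

import pandas as pd

scoresDF_CV = pd.DataFrame({'MCC': [0.40, 0.21]}).copy()   # explicit copy avoids the slice ambiguity
scoresDF_CV.loc[:, 'source_data'] = 'CV'                   # .loc assignment, as the warning recommends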
key: fit_time
value: [0.06580234 0.10567546 0.12266302 0.11939096 0.09735942 0.09454536
0.08974719 0.11474299 0.08605075 0.08144641]
mean value: 0.09774239063262939
key: score_time
value: [0.01108456 0.01102352 0.01132727 0.01120424 0.01135111 0.01137471
0.01135421 0.01111817 0.01142669 0.01149249]
mean value: 0.011275696754455566
key: test_mcc
value: [0.26603961 0.35388804 0.39743042 0.36718607 0.33191313 0.42509124
0.47558039 0.45225787 0.37395074 0.40697521]
mean value: 0.38503127266895676
key: train_mcc
value: [0.27559833 0.3969122 0.48478187 0.50436457 0.32340921 0.4726856
0.49499561 0.46419604 0.47680294 0.47048421]
mean value: 0.43642305736311926
key: test_fscore
value: [0.21428571 0.60750853 0.54320988 0.59907834 0.59405941 0.64069264
0.66350711 0.65873016 0.53892216 0.57954545]
mean value: 0.5639539388868168
key: train_fscore
value: [0.25384615 0.62916834 0.61955086 0.67724289 0.58926692 0.67109929
0.67677824 0.66637324 0.61118509 0.61618123]
mean value: 0.60106922547061
key: test_precision
value: [0.92307692 0.45876289 0.6984127 0.55084746 0.43902439 0.55639098
0.61946903 0.53896104 0.66176471 0.66233766]
mean value: 0.6109047767131917
key: train_precision
value: [0.8627451 0.48876404 0.74800638 0.65781084 0.42791645 0.55336257
0.63183594 0.54696532 0.74634146 0.72340426]
mean value: 0.638715236375957
key: test_recall
value: [0.12121212 0.8989899 0.44444444 0.65656566 0.91836735 0.75510204
0.71428571 0.84693878 0.45454545 0.51515152]
mean value: 0.632560296846011
key: train_recall
value: [0.14881623 0.88275085 0.52874859 0.69785795 0.94594595 0.85247748
0.7286036 0.85247748 0.51747463 0.53664036]
mean value: 0.6691793117807774
key: test_accuracy
value: [0.6975945 0.604811 0.74570447 0.70103093 0.57586207 0.7137931
0.75517241 0.70344828 0.73448276 0.74482759]
mean value: 0.6976727100367343
key: train_accuracy
value: [0.70302334 0.64676617 0.77956372 0.77420589 0.55202754 0.71614384
0.76358072 0.71002295 0.77658761 0.77314461]
mean value: 0.7195066395993666
key: test_roc_auc
value: [0.55800189 0.67605745 0.67274306 0.69026199 0.65970451 0.72390519
0.74516369 0.73857355 0.6670633 0.68951293]
mean value: 0.6820987566254488
key: train_roc_auc
value: [0.56832469 0.70412166 0.71860373 0.75564972 0.64765432 0.7492399
0.75508975 0.7446049 0.71357229 0.71562765]
mean value: 0.7072488596885611
key: test_jcc
value: [0.12 0.43627451 0.37288136 0.42763158 0.42253521 0.47133758
0.4964539 0.49112426 0.36885246 0.408 ]
mean value: 0.40150908556495757
key: train_jcc
value: [0.14537445 0.45896835 0.44880383 0.51199338 0.41770264 0.50500334
0.51146245 0.49966997 0.4400767 0.44527596]
mean value: 0.43843310563751314
MCC on Blind test: 0.21
Accuracy on Blind test: 0.67
Running classifier: 24
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.38441825 0.36125016 0.36491537 0.51286745 0.34814119 0.35783768
0.36107588 0.35301995 0.50380468 0.40338802]
mean value: 0.3950718641281128
key: score_time
value: [0.01203132 0.01277709 0.01189399 0.01234293 0.01186442 0.01211429
0.01202226 0.01210642 0.01289511 0.01205826]
mean value: 0.012210607528686523
key: test_mcc
value: [0.47473495 0.494513 0.40693149 0.34463472 0.49922344 0.33432729
0.40512041 0.43800172 0.4221095 0.52893647]
mean value: 0.434853299485225
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.64171123 0.65968586 0.59016393 0.56410256 0.6519337 0.52325581
0.57954545 0.60227273 0.61052632 0.68717949]
mean value: 0.6110377092747753
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.68181818 0.68478261 0.64285714 0.57291667 0.71084337 0.60810811
0.65384615 0.67948718 0.63736264 0.69791667]
mean value: 0.6569938719002365
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.60606061 0.63636364 0.54545455 0.55555556 0.60204082 0.45918367
0.52040816 0.54081633 0.58585859 0.67676768]
mean value: 0.5728509585652442
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.76975945 0.7766323 0.74226804 0.70790378 0.78275862 0.71724138
0.74482759 0.75862069 0.74482759 0.78965517]
mean value: 0.753449460836592
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.73011364 0.74266098 0.69460227 0.67100694 0.73852041 0.654071
0.68989158 0.705304 0.70654186 0.76246761]
mean value: 0.709518029457142
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.47244094 0.4921875 0.41860465 0.39285714 0.48360656 0.35433071
0.408 0.43089431 0.43939394 0.5234375 ]
mean value: 0.44157532532773186
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.21
Accuracy on Blind test: 0.79
Extracting tts_split_name: logo_skf_BT_embb
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_embb
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
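The rowbind reported above amounts to a pd.concat along rows over the 8 shared score columns; a minimal sketch, with frame names assumed from the log:

import pandas as pd

# scoresDF_CV and scoresDF_BT each hold 24 rows (one per classifier) and the same 8 score columns
combined_df_wf = pd.concat([scoresDF_CV, scoresDF_BT], axis=0, ignore_index=True)
print(combined_df_wf.shape)   # expected here: (48, 8)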
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: katg
Total genes: 6
Training on: 4
Training on genes: ['pnca', 'gid', 'rpob', 'embb']
Omitted genes: ['alr', 'katg']
Blind test gene: katg
/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_katg.csv
Training data dim: (2945, 171)
Training Target dim: (2945,)
Checked training df does NOT have Target var
TEST data dim: (817, 171)
TEST Target dim: (817,)
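A minimal sketch of the gene-wise split described above, assuming the combined CSV carries a gene_name column and that dst_mode is the target (both assumptions; the script's exact column handling may differ):

import pandas as pd

df = pd.read_csv('/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_katg.csv')
train_genes = ['pnca', 'gid', 'rpob', 'embb']
bts_gene = 'katg'

train_df = df[df['gene_name'].isin(train_genes)]
bts_df = df[df['gene_name'] == bts_gene]

X_train, y_train = train_df.drop(columns=['dst_mode']), train_df['dst_mode']
X_bts, y_bts = bts_df.drop(columns=['dst_mode']), bts_df['dst_mode']
print(X_train.shape, X_bts.shape)   # the run above reports (2945, 171) and (817, 171)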
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
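The 'Running classifier' blocks that follow iterate over this (name, estimator) list, wrapping each estimator in the same preprocessing pipeline. A minimal sketch of such a driver loop, with `models` (the list above) and `prep` (the ColumnTransformer) assumed:

from sklearn.pipeline import Pipeline

for clf_no, (model_name, model_fn) in enumerate(models, start=1):
    print('Running classifier:', clf_no)
    print('Model_name:', model_name)
    print('Model func:', model_fn)
    pipe = Pipeline(steps=[('prep', prep), ('model', model_fn)])
    print('Running model pipeline:', pipe)
    # ... cross-validate pipe and score the blind-test set as sketched earlier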
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.7302177 0.70884395 0.74518824 0.69031 0.71068215 0.70188713
0.6448102 0.64597631 0.6451807 0.64957929]
mean value: 0.6872675657272339
key: score_time
value: [0.01827192 0.01902556 0.0194788 0.02015972 0.01942563 0.01855612
0.01821113 0.01822186 0.01847959 0.01830912]
mean value: 0.018813943862915038
key: test_mcc
value: [0.37316484 0.40410487 0.57303435 0.39003179 0.42821942 0.57866439
0.45809185 0.50856647 0.40082443 0.52839126]
mean value: 0.46430936669642603
key: train_mcc
value: [0.56159149 0.57489431 0.56266775 0.55917167 0.56536167 0.54895323
0.56598317 0.55808426 0.56349642 0.56842857]
mean value: 0.5628632529517112
key: test_fscore
value: [0.5 0.51968504 0.66165414 0.46428571 0.54263566 0.67625899
0.58571429 0.61068702 0.51968504 0.6259542 ]
mean value: 0.5706560087173034
key: train_fscore
value: [0.64655172 0.65748709 0.65017065 0.6429192 0.65306122 0.6380789
0.65183918 0.64740867 0.65316456 0.65414176]
mean value: 0.649482295226657
key: test_precision
value: [0.59259259 0.63461538 0.75862069 0.7027027 0.64814815 0.72307692
0.62121212 0.70175439 0.62264151 0.71929825]
mean value: 0.6724662703015954
key: train_precision
value: [0.76530612 0.77484787 0.75745527 0.76763485 0.75739645 0.75
0.76352705 0.75147929 0.75145631 0.76447106]
mean value: 0.7603574278110011
key: test_recall
value: [0.43243243 0.44 0.58666667 0.34666667 0.46666667 0.63513514
0.55405405 0.54054054 0.44594595 0.55405405]
mean value: 0.5002162162162163
key: train_recall
value: [0.55970149 0.57100149 0.56950673 0.55306428 0.57399103 0.55522388
0.56865672 0.56865672 0.57761194 0.57164179]
mean value: 0.5669056064966647
key: test_accuracy
value: [0.78305085 0.79322034 0.84745763 0.79661017 0.8 0.84693878
0.80272109 0.82653061 0.79251701 0.83333333]
mean value: 0.8122379799377379
key: train_accuracy
value: [0.84528302 0.84981132 0.84528302 0.84490566 0.84603774 0.84081479
0.84647303 0.8434553 0.84496416 0.84722746]
mean value: 0.8454255496323922
key: test_roc_auc
value: [0.66644246 0.67681818 0.76151515 0.64833333 0.69015152 0.77665848
0.72020885 0.73163391 0.67751843 0.74066339]
mean value: 0.7089943689061335
key: train_roc_auc
value: [0.75081034 0.75748459 0.75396083 0.74826359 0.75595059 0.74631462
0.75454542 0.75252624 0.75649905 0.75603796]
mean value: 0.7532393231673611
key: test_jcc
value: [0.33333333 0.35106383 0.49438202 0.30232558 0.37234043 0.51086957
0.41414141 0.43956044 0.35106383 0.45555556]
mean value: 0.4024635996781775
key: train_jcc
value: [0.47770701 0.48974359 0.48166877 0.4737516 0.48484848 0.46851385
0.48350254 0.47864322 0.48496241 0.48604061]
mean value: 0.48093820783856805
MCC on Blind test: 0.25
Accuracy on Blind test: 0.62
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
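Aside, not a change made in this run: the BaggingClassifier above keeps scikit-learn's default of 10 base estimators, which is what triggers the 'Some inputs do not have OOB scores' warnings printed below; raising n_estimators usually makes the OOB estimate reliable.

from sklearn.ensemble import BaggingClassifier

# illustrative value; the run above uses the default n_estimators=10
bc = BaggingClassifier(n_estimators=100, oob_score=True, n_jobs=10, random_state=42)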
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
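The estimator repr printed above is the preprocessing shared by every classifier in this run: MinMaxScaler over the 165 numeric columns, OneHotEncoder over the six categorical columns, remaining columns passed through, then the model. A minimal sketch of assembling that object (num_cols is abbreviated here exactly as the repr elides it; the variable names are illustrative):

# Sketch of the ('prep', ...) + ('model', ...) pipeline shown in the repr above.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import BaggingClassifier

num_cols = ['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102',
            'disulfide_ss']          # ... 165 numeric columns in the full run
cat_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
            'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), num_cols),
                  ('cat', OneHotEncoder(), cat_cols)],
    remainder='passthrough')          # leftover columns go through unscaled

pipe = Pipeline(steps=[('prep', prep),
                       ('model', BaggingClassifier(n_jobs=10, oob_score=True,
                                                   random_state=42))])
# pipe.fit(X_train, y_train) would build the same estimator structure as the
# repr printed above.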
key: fit_time
value: [0.31460404 0.34694791 0.37534928 0.29799151 0.38209033 0.37807369
0.3838594 0.37467265 0.38038754 0.38300848]
mean value: 0.36169848442077634
key: score_time
value: [0.04638076 0.0249536 0.03375554 0.03102374 0.04002213 0.04501009
0.05068159 0.03894639 0.0460434 0.04319644]
mean value: 0.04000136852264404
key: test_mcc
value: [0.38440334 0.45717737 0.43571876 0.27761568 0.39674162 0.57905199
0.33966056 0.51703179 0.37152981 0.5256284 ]
mean value: 0.42845593167355406
key: train_mcc
value: [0.9558985 0.95886776 0.95190501 0.96087699 0.95691977 0.95600527
0.94180684 0.96000756 0.95897205 0.94699654]
mean value: 0.954825629072794
key: test_fscore
value: [0.46428571 0.54545455 0.52892562 0.40983607 0.52631579 0.66666667
0.47244094 0.61538462 0.48780488 0.59322034]
mean value: 0.5310335178587429
key: train_fscore
value: [0.96625767 0.96858238 0.96290572 0.97016067 0.96689761 0.96610169
0.95510836 0.96927803 0.96853415 0.95888285]
mean value: 0.9652709144464662
key: test_precision
value: [0.68421053 0.7173913 0.69565217 0.53191489 0.60344828 0.75862069
0.56603774 0.71428571 0.6122449 0.79545455]
mean value: 0.6679260757259422
key: train_precision
value: [0.99369085 0.99371069 0.9968 0.99373041 0.9968254 0.99840764
0.99196141 0.99841772 0.99684044 0.99838449]
mean value: 0.9958769060982682
key: test_recall
value: [0.35135135 0.44 0.42666667 0.33333333 0.46666667 0.59459459
0.40540541 0.54054054 0.40540541 0.47297297]
mean value: 0.4436936936936937
key: train_recall
value: [0.94029851 0.94469357 0.93124066 0.94768311 0.9387145 0.9358209
0.92089552 0.94179104 0.94179104 0.92238806]
mean value: 0.9365316913191888
key: test_accuracy
value: [0.79661017 0.81355932 0.80677966 0.7559322 0.78644068 0.85034014
0.77210884 0.82993197 0.78571429 0.83673469]
mean value: 0.8034151965871095
key: train_accuracy
value: [0.98339623 0.9845283 0.98188679 0.98528302 0.98377358 0.98340249
0.97812146 0.98491135 0.98453414 0.98000754]
mean value: 0.9829844914343464
key: test_roc_auc
value: [0.64852635 0.69045455 0.68151515 0.61666667 0.68106061 0.76547912
0.65042998 0.73390663 0.65952088 0.71603194]
mean value: 0.6843591874474229
key: train_roc_auc
value: [0.96913915 0.9713372 0.96511553 0.97283196 0.96885245 0.96765805
0.95918577 0.97064312 0.97039073 0.96094163]
mean value: 0.9676095604450055
key: test_jcc
value: [0.30232558 0.375 0.35955056 0.25773196 0.35714286 0.5
0.30927835 0.44444444 0.32258065 0.42168675]
mean value: 0.3649741146207996
key: train_jcc
value: [0.9347181 0.93907875 0.92846498 0.94205052 0.93591654 0.93442623
0.91407407 0.94038748 0.9389881 0.92101341]
mean value: 0.9329118185934367
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
MCC on Blind test: 0.2
Accuracy on Blind test: 0.61
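The UserWarning/RuntimeWarning pair that repeats above comes from oob_score=True with the default number of base estimators (10): with that few bootstrap draws, a small fraction of training rows is never left out-of-bag, their vote counts sum to zero, and the division inside oob_decision_function hits 0/0, hence the invalid-value warning. A hedged sketch on synthetic data (not the study's dataset) showing how raising n_estimators usually removes the problem:

# Demo on synthetic data: with more base estimators, essentially every sample
# gets at least one out-of-bag vote and the OOB warnings disappear.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

for n_est in (10, 100):   # 10 is the default that produced the warnings above
    clf = BaggingClassifier(n_estimators=n_est, oob_score=True,
                            n_jobs=10, random_state=42).fit(X, y)
    n_missing = int(np.isnan(clf.oob_decision_function_).any(axis=1).sum())
    print(f'n_estimators={n_est}: oob_score_={clf.oob_score_:.3f}, '
          f'samples with no OOB votes={n_missing}')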
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.17836046 0.18272161 0.18890953 0.17057157 0.20805979 0.17924666
0.20484114 0.19440317 0.19911289 0.19116879]
mean value: 0.18973956108093262
key: score_time
value: [0.01015973 0.01012325 0.01047587 0.01040292 0.01018357 0.01158905
0.0102005 0.01019883 0.01007247 0.01020861]
mean value: 0.010361480712890624
key: test_mcc
value: [0.37688667 0.33282025 0.33851984 0.30363308 0.35048913 0.27798427
0.30200869 0.30833602 0.26175944 0.39872742]
mean value: 0.3251164809121244
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.5248227 0.50955414 0.51282051 0.48717949 0.525 0.47560976
0.48 0.46715328 0.45454545 0.54794521]
mean value: 0.498463053595685
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.55223881 0.48780488 0.49382716 0.4691358 0.49411765 0.43333333
0.47368421 0.50793651 0.4375 0.55555556]
mean value: 0.49051339013924283
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.5 0.53333333 0.53333333 0.50666667 0.56 0.52702703
0.48648649 0.43243243 0.47297297 0.54054054]
mean value: 0.5092792792792793
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.77288136 0.73898305 0.74237288 0.72881356 0.74237288 0.70748299
0.73469388 0.75170068 0.71428571 0.7755102 ]
mean value: 0.7409097198201314
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.6821267 0.67121212 0.67348485 0.65560606 0.68227273 0.64760442
0.65233415 0.64576167 0.63421376 0.697543 ]
mean value: 0.664215945686534
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.35576923 0.34188034 0.34482759 0.3220339 0.3559322 0.312
0.31578947 0.3047619 0.29411765 0.37735849]
mean value: 0.3324470776622361
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.16
Accuracy on Blind test: 0.59
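Every train_* array for the decision tree is exactly 1.0 while the test folds average an MCC of about 0.33 and the blind-test MCC is 0.16: an unconstrained tree memorises the training folds. A minimal sketch (parameter values illustrative, not tuned for this dataset) of the usual counter-measure of limiting depth and leaf size, evaluated under the same kind of stratified CV:

# Sketch: compare an unconstrained tree with a depth/leaf-limited one under
# stratified 10-fold CV; X and y stand for the feature matrix and labels.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef

mcc_scorer = make_scorer(matthews_corrcoef)
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

def mean_cv_mcc(model, X, y):
    return cross_val_score(model, X, y, cv=skf, scoring=mcc_scorer,
                           n_jobs=-1).mean()

full_tree = DecisionTreeClassifier(random_state=42)
pruned_tree = DecisionTreeClassifier(max_depth=5, min_samples_leaf=10,
                                     random_state=42)
# mean_cv_mcc(full_tree, X, y) vs mean_cv_mcc(pruned_tree, X, y) shows whether
# the constraints actually help generalisation on this data.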
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02005243 0.02048969 0.02070951 0.02061391 0.02082157 0.02103257
0.02098823 0.02074504 0.01993299 0.02078414]
mean value: 0.020617008209228516
key: score_time
value: [0.00989008 0.00990033 0.0098381 0.00997305 0.00996017 0.0098927
0.00999093 0.00985622 0.00989842 0.00988412]
mean value: 0.009908413887023926
key: test_mcc
value: [0.12361079 0.24889279 0.17362284 0.29688912 0.21856923 0.18835901
0.191318 0.23469869 0.23520697 0.17462309]
mean value: 0.20857905279121097
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.33566434 0.45398773 0.39751553 0.47682119 0.44318182 0.40506329
0.4025974 0.42465753 0.44025157 0.38095238]
mean value: 0.41606927851734377
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.34782609 0.42045455 0.37209302 0.47368421 0.38613861 0.38095238
0.3875 0.43055556 0.41176471 0.38356164]
mean value: 0.3994530766280489
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.32432432 0.49333333 0.42666667 0.48 0.52 0.43243243
0.41891892 0.41891892 0.47297297 0.37837838]
mean value: 0.43659459459459454
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.6779661 0.69830508 0.67118644 0.73220339 0.66779661 0.68027211
0.68707483 0.71428571 0.69727891 0.69047619]
mean value: 0.6916845382220684
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.56035221 0.63075758 0.59060606 0.64909091 0.61909091 0.5980344
0.59809582 0.61627764 0.62285012 0.58691646]
mean value: 0.6072072109130933
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.20168067 0.29365079 0.24806202 0.31304348 0.28467153 0.25396825
0.25203252 0.26956522 0.28225806 0.23529412]
mean value: 0.26342266663791114
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.13
Accuracy on Blind test: 0.58
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.47257233 0.45381427 0.46976399 0.46076918 0.47586417 0.46682477
0.45304871 0.46457815 0.47684526 0.45131421]
mean value: 0.4645395040512085
key: score_time
value: [0.02542353 0.02521467 0.02605391 0.02512789 0.02545762 0.02472568
0.02454424 0.0260582 0.02436233 0.02408648]
mean value: 0.02510545253753662
key: test_mcc
value: [0.37885579 0.28177605 0.45484476 0.30764599 0.37304232 0.47178338
0.35716922 0.48946203 0.38097203 0.46968947]
mean value: 0.3965241048432712
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.44444444 0.37837838 0.53781513 0.37735849 0.46551724 0.55462185
0.43636364 0.54867257 0.47863248 0.54700855]
mean value: 0.476881275793443
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.70588235 0.58333333 0.72727273 0.64516129 0.65853659 0.73333333
0.66666667 0.79487179 0.65116279 0.74418605]
mean value: 0.6910406921316768
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.32432432 0.28 0.42666667 0.26666667 0.36 0.44594595
0.32432432 0.41891892 0.37837838 0.43243243]
mean value: 0.3657657657657658
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.79661017 0.76610169 0.81355932 0.77627119 0.78983051 0.81972789
0.78911565 0.82653061 0.79251701 0.81972789]
mean value: 0.798999192897498
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.63953773 0.60590909 0.68606061 0.60833333 0.64818182 0.69570025
0.63488943 0.69127764 0.65509828 0.69121622]
mean value: 0.6556204394439689
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.28571429 0.23333333 0.36781609 0.23255814 0.30337079 0.38372093
0.27906977 0.37804878 0.31460674 0.37647059]
mean value: 0.31547094450239305
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.16
Accuracy on Blind test: 0.59
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.23792315 3.16934085 3.22660422 3.14063549 3.16357064 3.14768457
3.13296199 3.18318343 3.13546348 3.18564844]
mean value: 3.1723016262054444
key: score_time
value: [0.01148772 0.01076007 0.01133776 0.010566 0.01024485 0.01057506
0.01066756 0.01037526 0.01046538 0.01051784]
mean value: 0.010699748992919922
key: test_mcc
value: [0.41320015 0.44496551 0.50874514 0.43864596 0.47171869 0.61164527
0.44118885 0.58431603 0.43774967 0.60470398]
mean value: 0.49568792543094364
key: train_mcc
value: [0.69134893 0.69430167 0.68753983 0.69542372 0.69456672 0.684635
0.71039334 0.7072122 0.69243715 0.69355963]
mean value: 0.6951418179940524
key: test_fscore
value: [0.5 0.5511811 0.60465116 0.50877193 0.58015267 0.6962963
0.56060606 0.66141732 0.52542373 0.67716535]
mean value: 0.586566562961446
key: train_fscore
value: [0.74255692 0.74629468 0.74015748 0.74738676 0.74270557 0.73591549
0.76017316 0.75524476 0.74543875 0.74652778]
mean value: 0.7462401344722867
key: test_precision
value: [0.69047619 0.67307692 0.72222222 0.74358974 0.67857143 0.7704918
0.63793103 0.79245283 0.70454545 0.81132075]
mean value: 0.722467838514907
key: train_precision
value: [0.89830508 0.89539749 0.89240506 0.89561587 0.90909091 0.89699571
0.90515464 0.91139241 0.89189189 0.89211618]
mean value: 0.898836523991343
key: test_recall
value: [0.39189189 0.46666667 0.52 0.38666667 0.50666667 0.63513514
0.5 0.56756757 0.41891892 0.58108108]
mean value: 0.49745945945945946
key: train_recall
value: [0.63283582 0.63976084 0.632287 0.64125561 0.62780269 0.6238806
0.65522388 0.64477612 0.64029851 0.64179104]
mean value: 0.6379912098699327
key: test_accuracy
value: [0.80338983 0.80677966 0.82711864 0.81016949 0.81355932 0.86054422
0.80272109 0.8537415 0.80952381 0.86054422]
mean value: 0.8248091779084514
key: train_accuracy
value: [0.8890566 0.89018868 0.88792453 0.89056604 0.89018868 0.88683516
0.89551113 0.89437948 0.88947567 0.88985289]
mean value: 0.8903978847426746
key: test_roc_auc
value: [0.66653418 0.69469697 0.72590909 0.67060606 0.71242424 0.78574939
0.70227273 0.75878378 0.679914 0.76781327]
mean value: 0.7164703714409597
key: train_roc_auc
value: [0.8042967 0.80726053 0.80327121 0.80800791 0.80330064 0.79982521
0.81600164 0.81178735 0.80702457 0.80777084]
mean value: 0.8068546598964949
key: test_jcc
value: [0.33333333 0.38043478 0.43333333 0.34117647 0.40860215 0.53409091
0.38947368 0.49411765 0.35632184 0.51190476]
mean value: 0.4182788911746712
key: train_jcc
value: [0.59052925 0.59527121 0.5875 0.59666203 0.5907173 0.5821727
0.61312849 0.60674157 0.59418283 0.59556787]
mean value: 0.5952473247225339
MCC on Blind test: 0.21
Accuracy on Blind test: 0.6
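The test_jcc and test_fscore arrays reported for every model are not independent: for binary predictions the Jaccard index and the F1 score satisfy J = F1 / (2 - F1), which is why the gradient-boosting fold with F1 = 0.5 shows jcc = 0.33333333 above. A quick check of that identity against the printed values:

# Verify J = F1 / (2 - F1) on the gradient-boosting test folds listed above.
test_fscore = [0.5, 0.5511811, 0.60465116, 0.50877193, 0.58015267,
               0.6962963, 0.56060606, 0.66141732, 0.52542373, 0.67716535]
test_jcc = [0.33333333, 0.38043478, 0.43333333, 0.34117647, 0.40860215,
            0.53409091, 0.38947368, 0.49411765, 0.35632184, 0.51190476]

for f1, jcc in zip(test_fscore, test_jcc):
    assert abs(f1 / (2 - f1) - jcc) < 1e-6, (f1, jcc)
print('J = F1 / (2 - F1) holds for every fold.')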
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.01931047 0.01911187 0.01920557 0.0189414 0.01900911 0.01909041
0.01881194 0.01872849 0.01935077 0.01941633]
mean value: 0.01909763813018799
key: score_time
value: [0.01022291 0.01011586 0.01031899 0.00995064 0.01083064 0.00997758
0.00991368 0.00994992 0.01013088 0.01016474]
mean value: 0.010157585144042969
key: test_mcc
value: [0.20848702 0.22403224 0.32232427 0.23836565 0.28361813 0.33869048
0.16523505 0.24462166 0.29097083 0.27128816]
mean value: 0.25876334990238203
key: train_mcc
value: [0.26507355 0.27172605 0.25983463 0.26304115 0.2556744 0.26231608
0.27323846 0.2640639 0.25388265 0.25522282]
mean value: 0.26240737020689336
key: test_fscore
value: [0.44670051 0.46 0.52083333 0.46632124 0.5 0.53061224
0.42654028 0.46875 0.5 0.48387097]
mean value: 0.4803628581470948
key: train_fscore
value: [0.48454188 0.48881432 0.48045326 0.48240636 0.4779661 0.48213273
0.48888889 0.48481502 0.47762864 0.47888889]
mean value: 0.4826536071785976
key: test_precision
value: [0.35772358 0.368 0.42735043 0.38135593 0.38686131 0.42622951
0.32846715 0.38135593 0.40163934 0.40178571]
mean value: 0.3860768902890995
key: train_precision
value: [0.38863841 0.39052726 0.38686131 0.38883806 0.38419619 0.38883806
0.39539171 0.38475022 0.38193202 0.38141593]
mean value: 0.38713891642325304
key: test_recall
value: [0.59459459 0.61333333 0.66666667 0.6 0.70666667 0.7027027
0.60810811 0.60810811 0.66216216 0.60810811]
mean value: 0.637045045045045
key: train_recall
value: [0.64328358 0.65321375 0.63378176 0.63527653 0.632287 0.63432836
0.64029851 0.65522388 0.63731343 0.64328358]
mean value: 0.6408290386631863
key: test_accuracy
value: [0.63050847 0.63389831 0.68813559 0.65084746 0.64067797 0.68707483
0.58843537 0.65306122 0.66666667 0.67346939]
mean value: 0.6512775279603367
key: train_accuracy
value: [0.65396226 0.65509434 0.65396226 0.65584906 0.65132075 0.65560166
0.66163712 0.64805734 0.64768012 0.64617126]
mean value: 0.652933617075792
key: test_roc_auc
value: [0.61856427 0.62712121 0.68106061 0.63409091 0.66242424 0.69226044
0.59496314 0.63814496 0.66517199 0.65178133]
mean value: 0.6465583102641926
key: train_roc_auc
value: [0.65042967 0.65447159 0.64727957 0.64903655 0.6450178 0.64856246
0.65457631 0.6504287 0.64424985 0.64521574]
mean value: 0.6489268256354188
key: test_jcc
value: [0.2875817 0.2987013 0.35211268 0.30405405 0.33333333 0.36111111
0.27108434 0.30612245 0.33333333 0.31914894]
mean value: 0.3166583228435076
key: train_jcc
value: [0.31973294 0.3234641 0.31618195 0.31787584 0.31403118 0.31763827
0.32352941 0.31997085 0.3137399 0.31482834]
mean value: 0.31809927762587653
MCC on Blind test: 0.23
Accuracy on Blind test: 0.62
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.0116446 3.1437993 3.22845554 3.22139907 3.10136342 3.27681184
3.22066808 3.12913656 3.13234878 3.26802135]
mean value: 3.173364853858948
key: score_time
value: [0.09579086 0.08934426 0.08900642 0.10104299 0.08988047 0.08922553
0.0892539 0.08983684 0.08920169 0.08946157]
mean value: 0.09120445251464844
key: test_mcc
value: [0.13763638 0.23828091 0.34023783 0.2907997 0.32548643 0.32641543
0.30085787 0.29283735 0.21682656 0.25925928]
mean value: 0.27286377329418626
key: train_mcc
value: [0.57758243 0.5696958 0.56114285 0.56481798 0.5570662 0.56628047
0.57678249 0.56262179 0.57437783 0.55652243]
mean value: 0.566689028102272
key: test_fscore
value: [0.17777778 0.24175824 0.37623762 0.29787234 0.36 0.30769231
0.34343434 0.28571429 0.25531915 0.26373626]
mean value: 0.2909542333237298
key: train_fscore
value: [0.58541667 0.57563025 0.56507937 0.56962025 0.56144068 0.56871036
0.5830721 0.56415695 0.57594937 0.56084656]
mean value: 0.5709922548551499
key: test_precision
value: [0.5 0.6875 0.73076923 0.73684211 0.72 0.82352941
0.68 0.76470588 0.6 0.70588235]
mean value: 0.6949228983091211
key: train_precision
value: [0.96896552 0.96819788 0.9673913 0.96774194 0.96363636 0.97463768
0.97212544 0.97435897 0.98201439 0.96363636]
mean value: 0.9702705843752133
key: test_recall
value: [0.10810811 0.14666667 0.25333333 0.18666667 0.24 0.18918919
0.22972973 0.17567568 0.16216216 0.16216216]
mean value: 0.18536936936936937
key: train_recall
value: [0.41940299 0.40956652 0.39910314 0.40358744 0.3961136 0.40149254
0.41641791 0.39701493 0.40746269 0.39552239]
mean value: 0.4045684135376927
key: test_accuracy
value: [0.74915254 0.76610169 0.78644068 0.77627119 0.78305085 0.78571429
0.77891156 0.77891156 0.76190476 0.77210884]
mean value: 0.7738567969560706
key: train_accuracy
value: [0.84981132 0.84754717 0.84490566 0.84603774 0.84377358 0.84609581
0.84949076 0.84496416 0.84835911 0.8434553 ]
mean value: 0.846444061692633
key: test_roc_auc
value: [0.53595451 0.5619697 0.61075758 0.5819697 0.60409091 0.58777641
0.59668305 0.57874693 0.56289926 0.56971744]
mean value: 0.5790565481153717
key: train_roc_auc
value: [0.70742877 0.70251168 0.69727999 0.69952214 0.69553282 0.69897948
0.70618977 0.69674068 0.70246935 0.69523722]
mean value: 0.7001891904777169
key: test_jcc
value: [0.09756098 0.1375 0.23170732 0.175 0.2195122 0.18181818
0.20731707 0.16666667 0.14634146 0.15189873]
mean value: 0.17153226070523075
key: train_jcc
value: [0.41384389 0.40412979 0.39380531 0.39823009 0.39027982 0.39734121
0.41150442 0.3929099 0.40444444 0.38970588]
mean value: 0.39961947624854216
MCC on Blind test: 0.13
Accuracy on Blind test: 0.57
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02128553 0.01805282 0.01670074 0.01797724 0.01869297 0.01643968
0.01865721 0.01635051 0.01848054 0.01834154]
mean value: 0.018097877502441406
key: score_time
value: [0.04753232 0.02716708 0.0296855 0.02861929 0.02539062 0.0271709
0.02848625 0.02621961 0.0288856 0.02959919]
mean value: 0.029875636100769043
key: test_mcc
value: [0.12919235 0.18664136 0.2342087 0.30195325 0.28674843 0.18132367
0.26425359 0.30754545 0.16790654 0.21056998]
mean value: 0.2270343322549358
key: train_mcc
value: [0.45292107 0.46654155 0.46188916 0.44395419 0.44395419 0.45668928
0.43730198 0.43696385 0.47892073 0.45725259]
mean value: 0.45363885985577285
key: test_fscore
value: [0.27826087 0.30630631 0.34234234 0.44094488 0.38938053 0.31578947
0.3826087 0.36893204 0.32786885 0.35294118]
mean value: 0.35053751681780215
key: train_fscore
value: [0.53053435 0.54320988 0.53152279 0.52115385 0.52115385 0.52895753
0.51631478 0.51858913 0.54368932 0.53561254]
mean value: 0.5290738010136599
key: test_precision
value: [0.3902439 0.47222222 0.52777778 0.53846154 0.57894737 0.45
0.53658537 0.65517241 0.41666667 0.46666667]
mean value: 0.5032743922301711
key: train_precision
value: [0.73544974 0.74479167 0.75690608 0.73045822 0.73045822 0.74863388
0.72311828 0.7176781 0.77777778 0.73629243]
mean value: 0.7401564387104362
key: test_recall
value: [0.21621622 0.22666667 0.25333333 0.37333333 0.29333333 0.24324324
0.2972973 0.25675676 0.27027027 0.28378378]
mean value: 0.27142342342342346
key: train_recall
value: [0.41492537 0.42750374 0.40956652 0.40508221 0.40508221 0.40895522
0.40149254 0.40597015 0.41791045 0.42089552]
mean value: 0.4117383932356157
key: test_accuracy
value: [0.71864407 0.73898305 0.75254237 0.75932203 0.76610169 0.73469388
0.7585034 0.77891156 0.72108844 0.73809524]
mean value: 0.7466885737345786
key: train_accuracy
value: [0.81433962 0.81849057 0.81773585 0.81207547 0.81207547 0.81591852
0.80988306 0.80950585 0.82270841 0.81554131]
mean value: 0.814827412937802
key: test_roc_auc
value: [0.55154702 0.57015152 0.5880303 0.63212121 0.61030303 0.57162162
0.60546683 0.60565111 0.57149877 0.58734644]
mean value: 0.5893737849326085
key: train_roc_auc
value: [0.68221016 0.68901689 0.68257225 0.67730133 0.67730133 0.68125702
0.6747493 0.67597851 0.6887634 0.68495559]
mean value: 0.6814105771061847
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
[the same ConvergenceWarning was emitted repeatedly at this point in the log; verbatim duplicates removed]
key: test_jcc
value: [0.16161616 0.18085106 0.20652174 0.28282828 0.24175824 0.1875
0.23655914 0.22619048 0.19607843 0.21428571]
mean value: 0.21341892507965937
key: train_jcc
value: [0.36103896 0.37288136 0.36195509 0.35240572 0.35240572 0.35958005
0.34799483 0.35006435 0.37333333 0.36575875]
mean value: 0.35974181623801443
MCC on Blind test: 0.1
Accuracy on Blind test: 0.57
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.10488629 0.10437632 0.10457969 0.10531783 0.10501885 0.10565329
0.10635376 0.10648179 0.12217259 0.10428119]
mean value: 0.10691215991973876
key: score_time
value: [0.02103496 0.01327944 0.01316905 0.01319456 0.0131731 0.01316476
0.01338291 0.01328635 0.02573609 0.01315784]
mean value: 0.015257906913757325
key: test_mcc
value: [0.41895076 0.29305776 0.4915699 0.33303217 0.49793912 0.53128137
0.37594631 0.47257551 0.39282767 0.49402933]
mean value: 0.4301209895515245
key: train_mcc
value: [0.50937865 0.51483821 0.49418745 0.52245182 0.51483821 0.50703708
0.51731727 0.49822197 0.5283673 0.51662378]
mean value: 0.51232617190761
key: test_fscore
value: [0.54814815 0.44274809 0.59541985 0.39622642 0.60740741 0.63157895
0.496 0.58646617 0.515625 0.59375 ]
mean value: 0.5413370022363148
key: train_fscore
value: [0.60331299 0.60801394 0.59323504 0.6202209 0.60801394 0.59929701
0.61125541 0.59298246 0.62134251 0.61139896]
mean value: 0.6069073160934974
key: test_precision
value: [0.60655738 0.51785714 0.69642857 0.67741935 0.68333333 0.71186441
0.60784314 0.66101695 0.61111111 0.7037037 ]
mean value: 0.6477135087508857
key: train_precision
value: [0.72536688 0.72860125 0.70661157 0.71850394 0.72860125 0.72863248
0.72783505 0.71914894 0.73373984 0.72540984]
mean value: 0.7242451028598318
key: test_recall
value: [0.5 0.38666667 0.52 0.28 0.54666667 0.56756757
0.41891892 0.52702703 0.44594595 0.51351351]
mean value: 0.47063063063063054
key: train_recall
value: [0.51641791 0.52167414 0.51121076 0.54559043 0.52167414 0.50895522
0.52686567 0.50447761 0.53880597 0.52835821]
mean value: 0.5224030073846017
key: test_accuracy
value: [0.79322034 0.75254237 0.82033898 0.78305085 0.82033898 0.83333333
0.78571429 0.81292517 0.78911565 0.82312925]
mean value: 0.8013709212498558
key: train_accuracy
value: [0.82830189 0.83018868 0.82301887 0.83132075 0.83018868 0.82798944
0.83062995 0.82497171 0.8340249 0.83025273]
mean value: 0.829088759670612
key: test_roc_auc
value: [0.69570136 0.6319697 0.72136364 0.61727273 0.73015152 0.74514742
0.66400491 0.71805897 0.6752457 0.72039312]
mean value: 0.6919309056073762
key: train_roc_auc
value: [0.72512815 0.72802536 0.7197649 0.73670233 0.72802536 0.72242309
0.73011633 0.7189223 0.73633888 0.7303578 ]
mean value: 0.7275804496383846
key: test_jcc
value: [0.37755102 0.28431373 0.42391304 0.24705882 0.43617021 0.46153846
0.32978723 0.41489362 0.34736842 0.42222222]
mean value: 0.3744816781549135
key: train_jcc
value: [0.43196005 0.43679599 0.4217016 0.44950739 0.43679599 0.42785445
0.44014963 0.42144638 0.45068664 0.44029851]
mean value: 0.43571966453858224
MCC on Blind test: 0.25
Accuracy on Blind test: 0.63
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.06459475 0.06384015 0.06299591 0.06626678 0.0678122 0.06761789
0.07208109 0.07282257 0.07325649 0.07462168]
mean value: 0.06859095096588134
key: score_time
value: [0.01315284 0.01536894 0.01640964 0.01685429 0.01511908 0.01510692
0.0150547 0.01497221 0.01495647 0.01500201]
mean value: 0.015199708938598632
key: test_mcc
value: [0.42305513 0.30696714 0.50397173 0.3743661 0.51427011 0.54552594
0.3925598 0.47061474 0.41734265 0.47943715]
mean value: 0.44281104857231535
key: train_mcc
value: [0.50783987 0.51013623 0.48846793 0.50426513 0.4861372 0.47953542
0.49024967 0.46801292 0.50466889 0.49393465]
mean value: 0.49332478938983104
key: test_fscore
value: [0.50434783 0.42975207 0.592 0.44036697 0.61654135 0.64705882
0.50406504 0.57142857 0.528 0.576 ]
mean value: 0.5409560653671572
key: train_fscore
value: [0.59205776 0.59768064 0.57685353 0.59521701 0.57247037 0.56700091
0.58064516 0.55868972 0.59285714 0.58614565]
mean value: 0.5819617892146394
key: test_precision
value: [0.70731707 0.56521739 0.74 0.70588235 0.70689655 0.70967742
0.63265306 0.69230769 0.64705882 0.70588235]
mean value: 0.6812892718498004
key: train_precision
value: [0.74885845 0.74115044 0.72997712 0.73043478 0.73364486 0.72833724
0.7264574 0.71561772 0.73777778 0.72368421]
mean value: 0.7315939988651952
key: test_recall
value: [0.39189189 0.34666667 0.49333333 0.32 0.54666667 0.59459459
0.41891892 0.48648649 0.44594595 0.48648649]
mean value: 0.4530990990990992
key: train_recall
value: [0.48955224 0.50074738 0.47683109 0.50224215 0.46935725 0.4641791
0.48358209 0.45820896 0.49552239 0.49253731]
mean value: 0.48327599669812377
key: test_accuracy
value: [0.80677966 0.76610169 0.82711864 0.79322034 0.82711864 0.83673469
0.79251701 0.81632653 0.79931973 0.81972789]
mean value: 0.8084964833390984
key: train_accuracy
value: [0.82943396 0.82981132 0.82339623 0.82754717 0.82301887 0.82082233
0.82346284 0.81705017 0.82798944 0.82421728]
mean value: 0.8246749606769962
key: test_roc_auc
value: [0.66879662 0.62787879 0.71712121 0.63727273 0.73469697 0.75638821
0.66855037 0.70687961 0.68206388 0.70915233]
mean value: 0.6908800719683073
key: train_roc_auc
value: [0.71699834 0.72084315 0.70863261 0.71982375 0.70590528 0.70281141
0.71099852 0.69831195 0.71797826 0.71446654]
mean value: 0.711676979783803
key: test_jcc
value: [0.3372093 0.27368421 0.42045455 0.28235294 0.44565217 0.47826087
0.33695652 0.4 0.35869565 0.40449438]
mean value: 0.373776059889669
key: train_jcc
value: [0.42051282 0.42620865 0.40533672 0.42370744 0.40102171 0.3956743
0.40909091 0.38762626 0.4213198 0.41457286]
mean value: 0.4105071478355362
MCC on Blind test: 0.23
Accuracy on Blind test: 0.62
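The lbfgs ConvergenceWarning that dominates the next classifier's output (and was interleaved into the K-Nearest Neighbors section above, apparently from a concurrent fit) means the solver hit its default cap of 100 iterations before converging. The warning's own suggestions apply: raise max_iter or scale the data (the printed pipeline MinMax-scales only the numeric block and passes the remainder through). A hedged sketch with an illustrative iteration budget:

# Sketch: a larger lbfgs iteration budget usually silences the
# ConvergenceWarning; scaling all features is the other common fix.
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

lr = LogisticRegression(max_iter=2000, random_state=42)
lr_cv = LogisticRegressionCV(cv=3, max_iter=2000, random_state=42)
# Either estimator can be dropped into the ('model', ...) slot of the pipeline
# printed above in place of the default-max_iter versions used in this run.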
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
[repeated lbfgs ConvergenceWarning messages: verbatim duplicates removed]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
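Note on the warnings above: lbfgs inside LogisticRegressionCV stops at its iteration cap, and precision is undefined on folds where no sample is predicted positive. A minimal sketch of how both could be handled at source, assuming access to the estimator and scorer definitions in the calling script (the names below are illustrative, not the script's own):

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer, precision_score

# Raise the lbfgs iteration budget (default max_iter=100) so the solver can converge
logreg_cv = LogisticRegressionCV(cv=3, random_state=42, max_iter=1000)

# Return 0.0 instead of warning when a fold predicts no positive samples
precision_scorer = make_scorer(precision_score, zero_division=0)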
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [0.88080454 0.81086469 0.89211535 0.83820128 0.81877232 0.87818956
0.81196332 0.77954078 0.91016746 0.79513884]
mean value: 0.841575813293457
key: score_time
value: [0.01342177 0.01660609 0.01793766 0.01593876 0.01318741 0.01358008
0.01634169 0.01353312 0.01361513 0.01347923]
mean value: 0.014764094352722168
key: test_mcc
value: [0. 0.32016428 0.51761564 0.34998685 0.51142112 0.
0.41358979 0. 0. 0. ]
mean value: 0.211277766928235
key: train_mcc
value: [0. 0.516548 0.48796588 0.50911496 0.47320133 0.
0.48327541 0. 0. 0. ]
mean value: 0.2470105582543641
key: test_fscore
value: [0. 0.46153846 0.609375 0.42201835 0.61068702 0.
0.5203252 0. 0. 0. ]
mean value: 0.262394403631511
key: train_fscore
value: [0. 0.60839161 0.57785779 0.5994695 0.55378859 0.
0.56749311 0. 0. 0. ]
mean value: 0.29070005906039853
key: test_precision
value: [0. 0.54545455 0.73584906 0.67647059 0.71428571 0.
0.65306122 0. 0. 0. ]
mean value: 0.3325121129069123
key: train_precision
value: [0. 0.73263158 0.72624434 0.73376623 0.74 0.
0.73747017 0. 0. 0. ]
mean value: 0.3670112323669444
key: test_recall
value: [0. 0.4 0.52 0.30666667 0.53333333 0.
0.43243243 0. 0. 0. ]
mean value: 0.21924324324324324
key: train_recall
value: [0. 0.52017937 0.47982063 0.50672646 0.44245142 0.
0.46119403 0. 0. 0. ]
mean value: 0.24103719072797447
key: test_accuracy
value: [0.74915254 0.76271186 0.83050847 0.78644068 0.82711864 0.74829932
0.79931973 0.74829932 0.74829932 0.74829932]
mean value: 0.7748449210192552
key: train_accuracy
value: [0.74716981 0.8309434 0.82301887 0.8290566 0.82 0.74726518
0.8223312 0.74726518 0.74726518 0.74726518]
mean value: 0.7861580606819782
key: test_roc_auc
value: [0.5 0.64318182 0.72818182 0.62833333 0.73030303 0.5
0.67757985 0.5 0.5 0.5 ]
mean value: 0.5907579852579852
key: train_roc_auc
value: [0.5 0.72803517 0.70937018 0.7223183 0.69497634 0.5
0.70283326 0.5 0.5 0.5 ]
mean value: 0.6057533252983638
key: test_jcc
value: [0. 0.3 0.43820225 0.26744186 0.43956044 0.
0.35164835 0. 0. 0. ]
mean value: 0.17968528988649188
key: train_jcc
value: [0. 0.43718593 0.40632911 0.4280303 0.38292367 0.
0.39615385 0. 0. 0. ]
mean value: 0.20506228667538534
MCC on Blind test: 0.27
Accuracy on Blind test: 0.64
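The key/value/mean blocks above are the dictionary returned by cross-validation. A minimal, self-contained sketch of how such output could be produced with stratified 10-fold CV and custom scorers; the synthetic data, the simplified pipeline and the exact scorer set are assumptions, not taken from the script itself:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import (make_scorer, matthews_corrcoef, f1_score, precision_score,
                             recall_score, accuracy_score, roc_auc_score, jaccard_score)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=300, weights=[0.75], random_state=42)  # stand-in data
pipeline = Pipeline(steps=[('prep', MinMaxScaler()),
                           ('model', LogisticRegressionCV(cv=3, random_state=42))])

scoring = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': make_scorer(f1_score),
           'precision': make_scorer(precision_score, zero_division=0),
           'recall': make_scorer(recall_score),
           'accuracy': make_scorer(accuracy_score),
           'roc_auc': make_scorer(roc_auc_score),
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
cv_scores = cross_validate(pipeline, X, y, cv=skf, scoring=scoring, return_train_score=True)
for key, value in cv_scores.items():   # keys: fit_time, score_time, test_*, train_*
    print('key:', key)
    print('value:', value)
    print('mean value:', np.mean(value))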
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [ 6.84585714 3.55296183 11.72970223 7.30736828 6.33841872 5.86001587
7.5640099 4.33539867 4.1961894 5.11789918]
mean value: 6.284782123565674
key: score_time
value: [0.01354027 0.01360893 0.01668406 0.01360846 0.01427078 0.01387954
0.01376796 0.01361585 0.01463485 0.0138793 ]
mean value: 0.014148998260498046
key: test_mcc
value: [0.41112218 0.34036648 0.39134394 0.34602881 0.47806688 0.54251158
0.3520294 0.47378899 0.38817941 0.5322362 ]
mean value: 0.42556738755013546
key: train_mcc
value: [0.68946636 0.5957485 0.72258255 0.67039049 0.64823844 0.63911642
0.68971422 0.59087888 0.60819169 0.59381161]
mean value: 0.6448139158860531
key: test_fscore
value: [0.55782313 0.4964539 0.54901961 0.46774194 0.61538462 0.64179104
0.50359712 0.578125 0.53521127 0.61904762]
mean value: 0.5564195242404074
key: train_fscore
value: [0.76761619 0.68855084 0.79150872 0.73452078 0.72245236 0.71114865
0.765204 0.6828479 0.69944489 0.67070009]
mean value: 0.723399441246177
key: test_precision
value: [0.56164384 0.53030303 0.53846154 0.59183673 0.59259259 0.71666667
0.53846154 0.68518519 0.55882353 0.75 ]
mean value: 0.6063974651392632
key: train_precision
value: [0.77108434 0.74137931 0.80307692 0.84901961 0.81040892 0.81906615
0.79014308 0.74558304 0.74619289 0.79671458]
mean value: 0.7872668843993739
key: test_recall
value: [0.55405405 0.46666667 0.56 0.38666667 0.64 0.58108108
0.47297297 0.5 0.51351351 0.52702703]
mean value: 0.5201981981981982
key: train_recall
value: [0.7641791 0.64275037 0.78026906 0.64723468 0.65171898 0.62835821
0.74179104 0.62985075 0.65820896 0.57910448]
mean value: 0.6723465631483837
key: test_accuracy
value: [0.77966102 0.75932203 0.76610169 0.77627119 0.79661017 0.83673469
0.76530612 0.81632653 0.7755102 0.83673469]
mean value: 0.7908578346592874
key: train_accuracy
value: [0.88301887 0.85320755 0.89622642 0.88188679 0.87358491 0.87099208
0.88494908 0.85213127 0.85703508 0.85628065]
mean value: 0.8709312683714939
key: test_roc_auc
value: [0.70462884 0.66287879 0.69818182 0.64787879 0.745 0.75190418
0.66830467 0.71136364 0.68857494 0.73396806]
mean value: 0.7012683710036651
key: train_roc_auc
value: [0.84370571 0.78351552 0.85782761 0.80418271 0.80011492 0.79070611
0.83757902 0.77858009 0.79124481 0.76456486]
mean value: 0.8052021365041068
key: test_jcc
value: [0.38679245 0.33018868 0.37837838 0.30526316 0.44444444 0.47252747
0.33653846 0.40659341 0.36538462 0.44827586]
mean value: 0.38743869309059525
key: train_jcc
value: [0.62287105 0.52503053 0.65495609 0.58042895 0.56549935 0.55176933
0.61970075 0.51842752 0.53780488 0.50455137]
mean value: 0.5681039804095791
MCC on Blind test: 0.2
Accuracy on Blind test: 0.61
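The "MCC on Blind test" / "Accuracy on Blind test" figures are computed on a held-out set after refitting the pipeline. A sketch of that step under assumed stand-in data and split names (the script's actual variable names are not shown in this log):

from sklearn.datasets import make_classification
from sklearn.metrics import matthews_corrcoef, accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=400, weights=[0.75], random_state=42)   # stand-in data
X_train, X_blind, y_train, y_blind = train_test_split(X, y, stratify=y, random_state=42)

pipeline = Pipeline(steps=[('prep', MinMaxScaler()),
                           ('model', MLPClassifier(max_iter=500, random_state=42))])
pipeline.fit(X_train, y_train)                 # refit on the training split
y_pred_blind = pipeline.predict(X_blind)       # score once on the held-out blind set
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred_blind), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred_blind), 2))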
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02839279 0.02372956 0.02335167 0.02343678 0.02368808 0.02374101
0.02365994 0.02343178 0.02377105 0.02355695]
mean value: 0.024075961112976073
key: score_time
value: [0.01326966 0.01320291 0.01325464 0.01327276 0.01324463 0.01318908
0.01314855 0.01317644 0.01328397 0.013165 ]
mean value: 0.013220763206481934
key: test_mcc
value: [0.11407935 0.179521 0.28519226 0.24348528 0.33883235 0.33181818
0.11599433 0.24414908 0.07899263 0.18402107]
mean value: 0.21160855335360956
key: train_mcc
value: [0.22951736 0.22495001 0.22399222 0.22399222 0.21288097 0.2147098
0.22340106 0.20714823 0.23088702 0.22798394]
mean value: 0.22194628343623496
key: test_fscore
value: [0.34210526 0.37762238 0.44444444 0.43708609 0.50340136 0.5
0.3483871 0.44155844 0.31081081 0.35658915]
mean value: 0.40620050349144343
key: train_fscore
value: [0.4260355 0.42276423 0.4124031 0.4124031 0.40791476 0.39645447
0.41666667 0.40601504 0.42483171 0.42192192]
mean value: 0.4147410507402632
key: test_precision
value: [0.33333333 0.39705882 0.5 0.43421053 0.51388889 0.5
0.33333333 0.425 0.31081081 0.41818182]
mean value: 0.4165817534393386
key: train_precision
value: [0.42228739 0.41812865 0.42834138 0.42834138 0.41550388 0.43082312
0.42307692 0.40909091 0.42578711 0.4244713 ]
mean value: 0.4225852045741593
key: test_recall
value: [0.35135135 0.36 0.4 0.44 0.49333333 0.5
0.36486486 0.45945946 0.31081081 0.31081081]
mean value: 0.3990630630630631
key: train_recall
value: [0.42985075 0.42750374 0.39760837 0.39760837 0.40059791 0.36716418
0.41044776 0.40298507 0.4238806 0.41940299]
mean value: 0.40770497289338065
key: test_accuracy
value: [0.66101695 0.69830508 0.74576271 0.71186441 0.75254237 0.74829932
0.65646259 0.70748299 0.65306122 0.71768707]
mean value: 0.7052484722702641
key: train_accuracy
value: [0.70716981 0.70528302 0.71396226 0.71396226 0.70641509 0.71746511
0.70954357 0.70199925 0.70992078 0.70954357]
mean value: 0.7095264727443543
key: test_roc_auc
value: [0.55802862 0.58681818 0.63181818 0.62227273 0.66712121 0.66590909
0.55970516 0.62518428 0.53949631 0.58267813]
mean value: 0.6039031892855423
key: train_roc_auc
value: [0.61543042 0.61329755 0.60920297 0.60920297 0.60514499 0.60155281
0.61057471 0.6030574 0.61527195 0.61353794]
mean value: 0.609627373122637
key: test_jcc
value: [0.20634921 0.23275862 0.28571429 0.27966102 0.33636364 0.33333333
0.2109375 0.28333333 0.184 0.21698113]
mean value: 0.2569432064808075
key: train_jcc
value: [0.27067669 0.26804124 0.25976562 0.25976562 0.25621415 0.24723618
0.26315789 0.25471698 0.2697056 0.26736441]
mean value: 0.2616644402637688
MCC on Blind test: 0.17
Accuracy on Blind test: 0.6
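MultinomialNB only accepts non-negative feature values; the MinMaxScaler/OneHotEncoder preprocessing in the pipeline above keeps the transformed numeric and one-hot columns inside [0, 1]. A standalone sketch of that pairing, using a tiny illustrative frame rather than the real feature set:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Tiny illustrative frame: one signed numeric column, one categorical column
df = pd.DataFrame({'score': [-1.2, 0.4, 2.5, -0.3], 'ss_class': ['H', 'E', 'C', 'H']})
y = [1, 0, 1, 0]

prep = ColumnTransformer(transformers=[('num', MinMaxScaler(), ['score']),       # rescaled to [0, 1]
                                       ('cat', OneHotEncoder(), ['ss_class'])])  # 0/1 indicators
mnb_pipe = Pipeline(steps=[('prep', prep), ('model', MultinomialNB())])
mnb_pipe.fit(df, y)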
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02633309 0.0260129 0.02624393 0.03456163 0.02594566 0.02618027
0.02598882 0.02596068 0.02599239 0.02958465]
mean value: 0.027280402183532716
key: score_time
value: [0.01359844 0.01357055 0.01358676 0.01360321 0.01363039 0.01353931
0.0135231 0.013556 0.01360488 0.04774213]
mean value: 0.0169954776763916
key: test_mcc
value: [0.13088523 0.15894099 0.18540636 0.21445857 0.12833152 0.21805284
0.15214379 0.23101946 0.23105914 0.27836615]
mean value: 0.19286640418314907
key: train_mcc
value: [0.2254126 0.2328189 0.23032357 0.22072662 0.214093 0.22168292
0.22647005 0.21663104 0.20950281 0.22153627]
mean value: 0.22191977765981458
key: test_fscore
value: [0.28813559 0.33070866 0.32478632 0.35294118 0.31818182 0.3559322
0.30508475 0.39694656 0.384 0.39655172]
mean value: 0.34532688122523625
key: train_fscore
value: [0.38245614 0.38658429 0.36837209 0.37275986 0.35826408 0.36380425
0.37745975 0.36429872 0.36133695 0.36547291]
mean value: 0.3700809040355889
key: test_precision
value: [0.38636364 0.40384615 0.45238095 0.47727273 0.36842105 0.47727273
0.40909091 0.45614035 0.47058824 0.54761905]
mean value: 0.4448995792649043
key: train_precision
value: [0.46382979 0.47198276 0.48768473 0.46532438 0.46859903 0.47699758
0.47098214 0.46728972 0.4576659 0.47494033]
mean value: 0.47052963727175123
key: test_recall
value: [0.22972973 0.28 0.25333333 0.28 0.28 0.28378378
0.24324324 0.35135135 0.32432432 0.31081081]
mean value: 0.2836576576576577
key: train_recall
value: [0.32537313 0.32735426 0.29596413 0.31091181 0.28998505 0.29402985
0.31492537 0.29850746 0.29850746 0.29701493]
mean value: 0.3052573455591995
key: test_accuracy
value: [0.71525424 0.71186441 0.73220339 0.73898305 0.69491525 0.7414966
0.72108844 0.73129252 0.73809524 0.76190476]
mean value: 0.7287097890003459
key: train_accuracy
value: [0.73433962 0.73773585 0.74377358 0.73584906 0.73773585 0.74009808
0.73745756 0.73670313 0.73330819 0.73934364]
mean value: 0.737634456203782
key: test_roc_auc
value: [0.55377889 0.56954545 0.57439394 0.58772727 0.55818182 0.58961916
0.56253071 0.60522113 0.60079853 0.61222359]
mean value: 0.5814020497255792
key: train_roc_auc
value: [0.5990502 0.60183967 0.59548332 0.59513284 0.58946501 0.59249701
0.59764441 0.59170704 0.58943546 0.59297995]
mean value: 0.5945234917415565
key: test_jcc
value: [0.16831683 0.19811321 0.19387755 0.21428571 0.18918919 0.21649485
0.18 0.24761905 0.23762376 0.24731183]
mean value: 0.20928319770387488
key: train_jcc
value: [0.23644252 0.23960613 0.22576967 0.22907489 0.21822272 0.22234763
0.23263506 0.22271715 0.22050717 0.22359551]
mean value: 0.22709184362961535
MCC on Blind test: 0.14
Accuracy on Blind test: 0.58
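The test_jcc / train_jcc keys are the Jaccard index of the positive class. For binary labels it relates to the F1 score as J = F1 / (2 - F1), which matches the paired values above (e.g. F1 ≈ 0.288 gives J ≈ 0.168). A quick check with illustrative labels, not taken from the log:

from sklearn.metrics import f1_score, jaccard_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]    # illustrative labels
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
f1 = f1_score(y_true, y_pred)        # 4/7
jcc = jaccard_score(y_true, y_pred)  # 2/5
assert abs(jcc - f1 / (2 - f1)) < 1e-9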
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.05746698 0.05136132 0.04383612 0.04672456 0.042202 0.04387569
0.04366302 0.05650306 0.0423429 0.04361653]
mean value: 0.047159218788146974
key: score_time
value: [0.01197648 0.01316428 0.01307797 0.01306987 0.01311755 0.01308727
0.0131371 0.01321793 0.01313305 0.01312017]
mean value: 0.013010168075561523
key: test_mcc
value: [0.31866388 0.2539387 0.4483867 0.14150178 0.41754355 0.27514236
0.35874876 0.29001863 0.34457124 0.29562423]
mean value: 0.31441398323161124
key: train_mcc
value: [0.36416044 0.27827969 0.37397883 0.1230035 0.4222551 0.25822655
0.43145292 0.31379177 0.45902221 0.34967496]
mean value: 0.33738459768679346
key: test_fscore
value: [0.51785714 0.22727273 0.46601942 0.05194805 0.5203252 0.25
0.47540984 0.49805447 0.5034965 0.31578947]
mean value: 0.38261728307601417
key: train_fscore
value: [0.54554455 0.25345044 0.41003272 0.05763689 0.46812957 0.21502591
0.50536585 0.5118525 0.59426848 0.381798 ]
mean value: 0.39431049095947585
key: test_precision
value: [0.38666667 0.76923077 0.85714286 1. 0.66666667 0.78571429
0.60416667 0.34972678 0.52173913 0.71428571]
mean value: 0.6655339532764692
key: train_precision
value: [0.40814815 0.7890625 0.75806452 0.8 0.77777778 0.81372549
0.72957746 0.36256219 0.60060976 0.74458874]
mean value: 0.6784116586780802
key: test_recall
value: [0.78378378 0.13333333 0.32 0.02666667 0.42666667 0.14864865
0.39189189 0.86486486 0.48648649 0.2027027 ]
mean value: 0.37850450450450446
key: train_recall
value: [0.82238806 0.1509716 0.28101644 0.02989537 0.3348281 0.1238806
0.38656716 0.87014925 0.5880597 0.25671642]
mean value: 0.3844472703745844
key: test_accuracy
value: [0.63389831 0.76949153 0.81355932 0.75254237 0.8 0.7755102
0.78231293 0.56122449 0.7585034 0.77891156]
mean value: 0.7425954110457742
key: train_accuracy
value: [0.65358491 0.7754717 0.79584906 0.75320755 0.80792453 0.77140702
0.80875141 0.58053565 0.79705771 0.78989061]
mean value: 0.7533680134943739
key: test_roc_auc
value: [0.6837471 0.55984848 0.65090909 0.51333333 0.6769697 0.56750614
0.65276413 0.66197789 0.66824324 0.58771499]
mean value: 0.6223014089778796
key: train_roc_auc
value: [0.70942635 0.56867106 0.62536435 0.51368569 0.65126059 0.55714474
0.6690534 0.6763669 0.72790163 0.61346674]
mean value: 0.631234146662468
key: test_jcc
value: [0.34939759 0.12820513 0.30379747 0.02666667 0.35164835 0.14285714
0.31182796 0.33160622 0.3364486 0.1875 ]
mean value: 0.24699551208298343
key: train_jcc
value: [0.37508509 0.14511494 0.25788752 0.02967359 0.30559345 0.12046444
0.3381201 0.3439528 0.42274678 0.23593964]
mean value: 0.25745783661287225
MCC on Blind test: 0.13
Accuracy on Blind test: 0.58
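The "Variables are collinear" lines interleaved above are raised in sklearn's discriminant_analysis module (see the file path), not by the PassiveAggressiveClassifier itself. If they clutter the log, one option is to filter just that warning; a sketch:

import warnings

# Silence only the collinearity UserWarning raised in sklearn.discriminant_analysis
warnings.filterwarnings('ignore',
                        message='Variables are collinear',
                        category=UserWarning,
                        module='sklearn.discriminant_analysis')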
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.05979133 0.06046247 0.0620656 0.06313896 0.06192613 0.09895945
0.08710527 0.07560062 0.05861163 0.06159997]
mean value: 0.06892614364624024
key: score_time
value: [0.01452422 0.01430082 0.01440382 0.01435018 0.01442909 0.017663
0.02010632 0.01443243 0.01493669 0.01451802]
mean value: 0.015366458892822265
key: test_mcc
value: [0.03998715 0.02047976 0.02333219 0.09970375 0.09102739 0.04849752
0.06324054 0.05681473 0.13515522 0.04882711]
mean value: 0.06270653606159562
key: train_mcc
value: [0.12162839 0.12150691 0.11978293 0.11745143 0.12035988 0.11926925
0.12100106 0.1169268 0.11211831 0.12100106]
mean value: 0.11910460047192746
key: test_fscore
value: [0.40336134 0.40449438 0.40555556 0.41690141 0.41322314 0.4056338
0.40793201 0.40687679 0.42196532 0.40555556]
mean value: 0.40914993095163377
key: train_fscore
value: [0.41757557 0.41708229 0.41669262 0.41617418 0.41682243 0.41692595
0.41731548 0.41640771 0.41537508 0.41731548]
mean value: 0.416768678472755
key: test_precision
value: [0.25441696 0.25622776 0.25614035 0.26428571 0.26041667 0.25622776
0.25806452 0.25818182 0.26838235 0.25524476]
mean value: 0.25875886514713337
key: train_precision
value: [0.26388342 0.26348956 0.2631786 0.26276512 0.26328217 0.26336478
0.26367572 0.26295133 0.26212833 0.26367572]
mean value: 0.26323947513544826
key: test_recall
value: [0.97297297 0.96 0.97333333 0.98666667 1. 0.97297297
0.97297297 0.95945946 0.98648649 0.98648649]
mean value: 0.9771351351351351
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.2779661 0.28135593 0.27457627 0.29830508 0.2779661 0.28231293
0.28911565 0.29591837 0.31972789 0.27210884]
mean value: 0.2869353164994811
key: train_accuracy
value: [0.29471698 0.29433962 0.29320755 0.29169811 0.29358491 0.29309694
0.29422859 0.29158808 0.28857035 0.29422859]
mean value: 0.2929259731108945
key: test_roc_auc
value: [0.50911092 0.505 0.50484848 0.52515152 0.51590909 0.51148649
0.51603194 0.51609337 0.54097052 0.50915233]
mean value: 0.5153754655519363
key: train_roc_auc
value: [0.5280303 0.52801615 0.52725896 0.52624937 0.52751136 0.52700656
0.52776376 0.52599697 0.52397779 0.52776376]
mean value: 0.5269574977437168
key: test_jcc
value: [0.25263158 0.25352113 0.2543554 0.2633452 0.26041667 0.25441696
0.25622776 0.25539568 0.26739927 0.2543554 ]
mean value: 0.25720650394882283
key: train_jcc
value: [0.26388342 0.26348956 0.2631786 0.26276512 0.26328217 0.26336478
0.26367572 0.26295133 0.26212833 0.26367572]
mean value: 0.26323947513544826
MCC on Blind test: 0.14
Accuracy on Blind test: 0.49
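For QDA the train_recall values are exactly 1.0 while train_precision sits near 0.26 and train_accuracy near 0.29, the signature of a model that labels almost every sample as the positive class when positives make up roughly a quarter of the data. A short illustration with assumed toy counts, chosen only to mimic that pattern:

# Illustration with assumed toy counts (not the project's data): metrics for a
# near-degenerate classifier that calls almost everything positive, mimicking
# the QDA pattern above (recall ~1.0, precision ~0.26, accuracy ~0.29).
import numpy as np
from sklearn.metrics import (matthews_corrcoef, precision_score, recall_score,
                             accuracy_score)

y_true = np.array([1] * 26 + [0] * 74)             # ~26% positives, as in the folds
y_pred = np.array([1] * 26 + [1] * 70 + [0] * 4)   # positive for all but 4 negatives

print('recall   :', recall_score(y_true, y_pred))        # 1.0
print('precision:', precision_score(y_true, y_pred))     # ~0.27
print('accuracy :', accuracy_score(y_true, y_pred))      # 0.30
print('MCC      :', matthews_corrcoef(y_true, y_pred))   # ~0.12, a weak correlation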
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [8.52501249 8.80194497 8.83696699 8.47504067 8.23531604 8.42968321
8.34944677 8.34569097 8.47139978 8.30373311]
mean value: 8.477423501014709
key: score_time
value: [0.13654065 0.14214897 0.1433177 0.13267255 0.13087225 0.13163233
0.1419208 0.13635802 0.1303761 0.13124824]
mean value: 0.13570876121520997
key: test_mcc
value: [0.45604029 0.46680864 0.55520606 0.39874535 0.46247232 0.59277156
0.51311676 0.47850165 0.39430558 0.5247362 ]
mean value: 0.4842704389411443
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.52631579 0.55 0.62903226 0.4587156 0.56 0.66129032
0.57391304 0.54385965 0.46846847 0.5862069 ]
mean value: 0.5557802024070381
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.75 0.73333333 0.79591837 0.73529412 0.7 0.82
0.80487805 0.775 0.7027027 0.80952381]
mean value: 0.7626650379334331
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.40540541 0.44 0.52 0.33333333 0.46666667 0.55405405
0.44594595 0.41891892 0.35135135 0.45945946]
mean value: 0.4395135135135135
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.81694915 0.81694915 0.8440678 0.8 0.81355932 0.85714286
0.83333333 0.82312925 0.79931973 0.83673469]
mean value: 0.8241185287674393
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.68007827 0.69272727 0.73727273 0.64621212 0.69924242 0.75657248
0.70479115 0.68900491 0.65067568 0.71154791]
mean value: 0.6968124951360245
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.35714286 0.37931034 0.45882353 0.29761905 0.38888889 0.4939759
0.40243902 0.37349398 0.30588235 0.41463415]
mean value: 0.38722100710811
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.15
Accuracy on Blind test: 0.58
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [1.88341975 1.93829131 1.8873353 1.91797018 1.92363787 1.9179306
1.88534665 1.92443395 1.88912463 1.88133192]
mean value: 1.9048822164535522
key: score_time
value: [0.22852802 0.33917832 0.41764474 0.36378574 0.35395026 0.2694521
0.36723185 0.35790968 0.38749337 0.36625123]
mean value: 0.3451425313949585
key: test_mcc
value: [0.44365172 0.46163008 0.56414887 0.42171747 0.49245288 0.58062229
0.50077128 0.48863299 0.42745921 0.54708118]
mean value: 0.4928167996324935
key: train_mcc
value: [0.75398918 0.75743723 0.75458257 0.75600012 0.74854949 0.74906843
0.76161352 0.74975555 0.75843621 0.7478114 ]
mean value: 0.7537243705871026
key: test_fscore
value: [0.51327434 0.52631579 0.61538462 0.45714286 0.58064516 0.63865546
0.55357143 0.53211009 0.48148148 0.59649123]
mean value: 0.5495072451625744
key: train_fscore
value: [0.78647687 0.78820375 0.78755556 0.78787879 0.78026906 0.78026906
0.79362267 0.78214286 0.79040853 0.77956989]
mean value: 0.7856397032009309
key: test_precision
value: [0.74358974 0.76923077 0.85714286 0.8 0.73469388 0.84444444
0.81578947 0.82857143 0.76470588 0.85 ]
mean value: 0.8008168476567417
key: train_precision
value: [0.97356828 0.98 0.97149123 0.97571744 0.97533632 0.97752809
0.97603486 0.97333333 0.97587719 0.97533632]
mean value: 0.9754223069633239
key: test_recall
value: [0.39189189 0.4 0.48 0.32 0.48 0.51351351
0.41891892 0.39189189 0.35135135 0.45945946]
mean value: 0.4207027027027027
key: train_recall
value: [0.65970149 0.65919283 0.66218236 0.66068759 0.65022422 0.64925373
0.66865672 0.65373134 0.6641791 0.64925373]
mean value: 0.657706311491868
key: test_accuracy
value: [0.81355932 0.81694915 0.84745763 0.80677966 0.82372881 0.8537415
0.82993197 0.82653061 0.80952381 0.84353741]
mean value: 0.8271739882393636
key: train_accuracy
value: [0.90943396 0.91056604 0.90981132 0.91018868 0.90754717 0.90758204
0.91210864 0.90795926 0.91097699 0.90720483]
mean value: 0.9093378931410717
key: test_roc_auc
value: [0.67332151 0.67954545 0.72636364 0.64636364 0.71045455 0.74084767
0.69355037 0.68230958 0.65749386 0.71609337]
mean value: 0.6926343624578919
key: train_roc_auc
value: [0.82682044 0.82732483 0.82781001 0.82756742 0.82233573 0.82210289
0.83155198 0.8238369 0.82931318 0.82185049]
mean value: 0.8260513874317981
key: test_jcc
value: [0.3452381 0.35714286 0.44444444 0.2962963 0.40909091 0.4691358
0.38271605 0.3625 0.31707317 0.425 ]
mean value: 0.3808637624796161
key: train_jcc
value: [0.64809384 0.65044248 0.64956012 0.65 0.63970588 0.63970588
0.65785609 0.64222874 0.65345081 0.63876652]
mean value: 0.6469810361968263
MCC on Blind test: 0.16
Accuracy on Blind test: 0.58
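The FutureWarnings printed for the random forest runs concern max_features='auto', which this 'Random Forest2' estimator passes explicitly. The warning's own remedy is to set max_features='sqrt' explicitly (or drop the argument, since 'sqrt' is the default). A sketch of that change, not the repository's code:

# Sketch of the fix suggested by the FutureWarning above: replace the
# deprecated max_features='auto' with the equivalent explicit 'sqrt'
# (or simply omit the argument, since 'sqrt' is the classifier default).
from sklearn.ensemble import RandomForestClassifier

rf2 = RandomForestClassifier(max_features='sqrt',   # was: max_features='auto'
                             min_samples_leaf=5,
                             n_estimators=1000,
                             n_jobs=10,
                             oob_score=True,
                             random_state=42)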
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05359364 0.03821516 0.04512334 0.0454278 0.03872037 0.04183793
0.04194188 0.03860831 0.03863955 0.03841281]
mean value: 0.042052078247070315
key: score_time
value: [0.03482509 0.03591609 0.02700591 0.0271523 0.02563882 0.02626181
0.03588104 0.0358212 0.03579283 0.03475952]
mean value: 0.03190546035766602
key: test_mcc
value: [0.41320015 0.35426854 0.55414974 0.36769888 0.49479381 0.51426966
0.3938103 0.43514709 0.40996462 0.4837102 ]
mean value: 0.4421012991552603
key: train_mcc
value: [0.49870093 0.49531872 0.46520509 0.49298181 0.48422381 0.47093177
0.48996795 0.46960764 0.48691988 0.48065299]
mean value: 0.48345105821300116
key: test_fscore
value: [0.5 0.47154472 0.62295082 0.3960396 0.58730159 0.609375
0.49152542 0.51724138 0.51239669 0.56666667]
mean value: 0.527504189030197
key: train_fscore
value: [0.57567317 0.57564576 0.54562559 0.57716895 0.56261682 0.54648956
0.57142857 0.55255814 0.56719184 0.56485741]
mean value: 0.563925580735505
key: test_precision
value: [0.69047619 0.60416667 0.80851064 0.76923077 0.7254902 0.72222222
0.65909091 0.71428571 0.65957447 0.73913043]
mean value: 0.7092178209216491
key: train_precision
value: [0.76167076 0.75180723 0.73604061 0.74178404 0.75062344 0.75
0.74698795 0.73333333 0.74816626 0.73621103]
mean value: 0.7456624654163001
key: test_recall
value: [0.39189189 0.38666667 0.50666667 0.26666667 0.49333333 0.52702703
0.39189189 0.40540541 0.41891892 0.45945946]
mean value: 0.4247927927927928
key: train_recall
value: [0.46268657 0.46636771 0.43348281 0.47234679 0.44992526 0.42985075
0.46268657 0.44328358 0.45671642 0.45820896]
mean value: 0.453555540682239
key: test_accuracy
value: [0.80338983 0.77966102 0.8440678 0.79322034 0.82372881 0.82993197
0.79591837 0.80952381 0.79931973 0.82312925]
mean value: 0.8101890925861872
key: train_accuracy
value: [0.82754717 0.82641509 0.81773585 0.82528302 0.82339623 0.81969068
0.82459449 0.81855903 0.82384006 0.82157676]
mean value: 0.8228638392062801
key: test_roc_auc
value: [0.66653418 0.65015152 0.73287879 0.61969697 0.71484848 0.7294226
0.66185504 0.67542998 0.67309582 0.702457 ]
mean value: 0.6826370381076263
key: train_roc_auc
value: [0.70684833 0.70718689 0.69049204 0.70840964 0.69972285 0.69069519
0.70484152 0.69438283 0.70236124 0.70134072]
mean value: 0.7006281239529573
key: test_jcc
value: [0.33333333 0.30851064 0.45238095 0.24691358 0.41573034 0.43820225
0.3258427 0.34883721 0.34444444 0.39534884]
mean value: 0.36095442761140206
key: train_jcc
value: [0.4041721 0.40414508 0.37516171 0.40564827 0.39141743 0.37597911
0.4 0.38174807 0.39586028 0.39358974]
mean value: 0.3927721789122867
MCC on Blind test: 0.23
Accuracy on Blind test: 0.62
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.17352843 0.16714263 0.19189525 0.18127751 0.2042501 0.23863363
0.17749143 0.15182614 0.11910629 0.18887329]
mean value: 0.17940247058868408
key: score_time
value: [0.01320553 0.02053213 0.02036834 0.0203805 0.02118993 0.02252436
0.01326919 0.02037168 0.01357317 0.02479815]
mean value: 0.01902129650115967
key: test_mcc
value: [0.40539437 0.33723761 0.53098454 0.38163202 0.46906233 0.55475032
0.3938103 0.43514709 0.39035254 0.51311676]
mean value: 0.4411487883141567
key: train_mcc
value: [0.48738343 0.48166015 0.47500077 0.49106133 0.47558654 0.4591651
0.48996795 0.46960764 0.48671729 0.46867356]
mean value: 0.47848237425963924
key: test_fscore
value: [0.47272727 0.43478261 0.59322034 0.41176471 0.55737705 0.640625
0.49152542 0.51724138 0.48275862 0.57391304]
mean value: 0.5175935442675731
key: train_fscore
value: [0.55555556 0.55086372 0.54387657 0.56137012 0.54085603 0.52497551
0.57142857 0.55255814 0.55756422 0.54545455]
mean value: 0.5504502996172193
key: test_precision
value: [0.72222222 0.625 0.81395349 0.77777778 0.72340426 0.75925926
0.65909091 0.71428571 0.66666667 0.80487805]
mean value: 0.726653834177428
key: train_precision
value: [0.77540107 0.769437 0.76630435 0.77225131 0.77437326 0.76353276
0.74698795 0.73333333 0.76902887 0.74611399]
mean value: 0.7616763892318994
key: test_recall
value: [0.35135135 0.33333333 0.46666667 0.28 0.45333333 0.55405405
0.39189189 0.40540541 0.37837838 0.44594595]
mean value: 0.40603603603603605
key: train_recall
value: [0.43283582 0.42899851 0.42152466 0.44095665 0.41554559 0.4
0.46268657 0.44328358 0.43731343 0.42985075]
mean value: 0.43129955603150166
key: test_accuracy
value: [0.80338983 0.77966102 0.83728814 0.79661017 0.81694915 0.84353741
0.79591837 0.80952381 0.79591837 0.83333333]
mean value: 0.8112129597601753
key: train_accuracy
value: [0.82490566 0.82339623 0.82150943 0.82603774 0.82188679 0.81705017
0.82459449 0.81855903 0.82459449 0.81893625]
mean value: 0.8221470288890629
key: test_roc_auc
value: [0.65305124 0.63257576 0.71515152 0.62636364 0.69712121 0.74748157
0.66185504 0.67542998 0.65737101 0.70479115]
mean value: 0.6771192109427404
key: train_roc_auc
value: [0.69520579 0.69279304 0.68905612 0.69851972 0.68732858 0.67905098
0.70484152 0.69438283 0.69644571 0.69019039]
mean value: 0.692781468468996
key: test_jcc
value: [0.30952381 0.27777778 0.42168675 0.25925926 0.38636364 0.47126437
0.3258427 0.34883721 0.31818182 0.40243902]
mean value: 0.35211763462321277
key: train_jcc
value: [0.38461538 0.38013245 0.37350993 0.39021164 0.37066667 0.35590969
0.4 0.38174807 0.38654354 0.375 ]
mean value: 0.3798337377754252
MCC on Blind test: 0.2
Accuracy on Blind test: 0.6
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.30582404 0.34014201 0.34719992 0.34380317 0.34571886 0.34314227
0.3397162 0.33660436 0.33963943 0.34548593]
mean value: 0.33872761726379397
key: score_time
value: [0.08026695 0.08145833 0.09567952 0.09295559 0.09529305 0.09251738
0.08222485 0.08805132 0.09394479 0.09490585]
mean value: 0.08972976207733155
key: test_mcc
value: [0.29837255 0.31034785 0.43996404 0.41092838 0.38139239 0.42588908
0.37610336 0.34232149 0.39999756 0.42663556]
mean value: 0.381195226114471
key: train_mcc
value: [0.45994859 0.46230661 0.43778232 0.45660105 0.44045007 0.43729347
0.46573246 0.44366856 0.47010976 0.43998291]
mean value: 0.45138758069841867
key: test_fscore
value: [0.32989691 0.34343434 0.42857143 0.40816327 0.4 0.46153846
0.43396226 0.36734694 0.44230769 0.44 ]
mean value: 0.4055221301300997
key: train_fscore
value: [0.485623 0.48335124 0.46103896 0.47793326 0.45082873 0.44567627
0.48988285 0.46019629 0.49253731 0.46730975]
mean value: 0.4714377678536391
key: test_precision
value: [0.69565217 0.70833333 0.91304348 0.86956522 0.8 0.8
0.71875 0.75 0.76666667 0.84615385]
mean value: 0.7868164715719063
key: train_precision
value: [0.84758364 0.85877863 0.83529412 0.85384615 0.86440678 0.86637931
0.85501859 0.85425101 0.8619403 0.82889734]
mean value: 0.8526395866992781
key: test_recall
value: [0.21621622 0.22666667 0.28 0.26666667 0.26666667 0.32432432
0.31081081 0.24324324 0.31081081 0.2972973 ]
mean value: 0.2742702702702703
key: train_recall
value: [0.34029851 0.33632287 0.31838565 0.33183857 0.30493274 0.3
0.34328358 0.31492537 0.34477612 0.32537313]
mean value: 0.32601365370457136
key: test_accuracy
value: [0.77966102 0.77966102 0.81016949 0.80338983 0.79661017 0.80952381
0.79591837 0.78911565 0.80272109 0.80952381]
mean value: 0.7976294246512163
key: train_accuracy
value: [0.81773585 0.81849057 0.81207547 0.81698113 0.81245283 0.81139193
0.81931347 0.81327801 0.82044512 0.81252358]
mean value: 0.8154687942606207
key: test_roc_auc
value: [0.592271 0.59742424 0.63545455 0.62651515 0.6219697 0.6485258
0.63495086 0.60798526 0.63949631 0.63955774]
mean value: 0.6244150610915317
key: train_roc_auc
value: [0.65979572 0.65882272 0.64859212 0.65632817 0.64438964 0.64217567
0.66179828 0.64837637 0.66304934 0.65132867]
mean value: 0.6534656681680174
key: test_jcc
value: [0.19753086 0.20731707 0.27272727 0.25641026 0.25 0.3
0.27710843 0.225 0.28395062 0.28205128]
mean value: 0.2552095799575964
key: train_jcc
value: [0.32067511 0.31869688 0.29957806 0.31400283 0.29101284 0.28673324
0.32440056 0.29886686 0.32673267 0.3048951 ]
mean value: 0.30855941521581826
MCC on Blind test: 0.14
Accuracy on Blind test: 0.58
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.07313251 0.11923933 0.10235262 0.11115861 0.11419773 0.10082531
0.1314714 0.08716035 0.10809422 0.11590266]
mean value: 0.10635347366333008
key: score_time
value: [0.01195264 0.01173186 0.01043296 0.01170778 0.01173329 0.01200604
0.01181054 0.01212859 0.01145005 0.01121402]
mean value: 0.011616778373718262
key: test_mcc
value: [0.28827318 0.23445058 0.40810098 0.17360008 0.45550587 0.44490362
0.34380402 0.38547461 0.37142832 0.23707113]
mean value: 0.33426123809938374
key: train_mcc
value: [0.29559321 0.29634492 0.45990022 0.19294247 0.48433127 0.44948315
0.48336624 0.39580896 0.37878443 0.30151016]
mean value: 0.3738065029802936
key: test_fscore
value: [0.49097473 0.20689655 0.57142857 0.07692308 0.57142857 0.59615385
0.51006711 0.41584158 0.4 0.19047619]
mean value: 0.40301902356286473
key: train_fscore
value: [0.49486166 0.25031928 0.60572988 0.10183876 0.56959707 0.60212647
0.61561119 0.42151482 0.37106184 0.25855513]
mean value: 0.4291216104473131
key: test_precision
value: [0.33497537 0.75 0.51612903 1. 0.65517241 0.46268657
0.50666667 0.77777778 0.76923077 0.8 ]
mean value: 0.6572638596348689
key: train_precision
value: [0.33655914 0.85964912 0.55708908 0.94736842 0.73522459 0.48164727
0.60755814 0.7966805 0.85026738 0.85714286]
mean value: 0.7029186497752251
key: test_recall
value: [0.91891892 0.12 0.64 0.04 0.50666667 0.83783784
0.51351351 0.28378378 0.27027027 0.10810811]
mean value: 0.4239099099099099
key: train_recall
value: [0.93432836 0.14648729 0.66367713 0.05381166 0.46487294 0.80298507
0.6238806 0.28656716 0.23731343 0.15223881]
mean value: 0.4366162461236419
key: test_accuracy
value: [0.5220339 0.76610169 0.7559322 0.7559322 0.80677966 0.71428571
0.75170068 0.79931973 0.79591837 0.76870748]
mean value: 0.7436711633806066
key: train_accuracy
value: [0.51773585 0.77849057 0.78188679 0.76037736 0.82264151 0.73179932
0.80309317 0.80120709 0.7966805 0.77932856]
mean value: 0.7573240713721415
key: test_roc_auc
value: [0.6540296 0.55318182 0.71772727 0.52 0.70787879 0.75528256
0.67266585 0.62825553 0.62149877 0.5495086 ]
mean value: 0.6380028776205247
key: train_roc_auc
value: [0.65554802 0.56920528 0.74274215 0.52640103 0.70416792 0.75535422
0.7437929 0.63091609 0.61158958 0.57182864]
mean value: 0.6511545836291296
key: test_jcc
value: [0.32535885 0.11538462 0.4 0.04 0.4 0.42465753
0.34234234 0.2625 0.25 0.10526316]
mean value: 0.2665506501542911
key: train_jcc
value: [0.32878151 0.14306569 0.43444227 0.05365127 0.39820743 0.4307446
0.44468085 0.26703755 0.2277937 0.14847162]
mean value: 0.28768764801286073
MCC on Blind test: 0.19
Accuracy on Blind test: 0.6
Running classifier: 24
Model_name: XGBoost
Model func: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.37598205 0.99958062 0.51176476 0.37363863 0.3554492 0.39733934
0.40253305 0.52792478 0.37787867 0.39326835]
mean value: 0.4715359449386597
key: score_time
value: [0.011971 0.01196456 0.01215982 0.01223326 0.01296926 0.01236844
0.01305866 0.01195025 0.01216412 0.01225519]
mean value: 0.012309455871582031
key: test_mcc
value: [0.44722252 0.47171869 0.46709937 0.48846092 0.5400443 0.6246624
0.47257551 0.53128137 0.40082443 0.57690863]
mean value: 0.5020798139669522
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.5483871 0.58015267 0.58208955 0.56666667 0.63076923 0.71014493
0.58646617 0.63157895 0.51968504 0.66153846]
mean value: 0.601747875943135
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.68 0.67857143 0.66101695 0.75555556 0.74545455 0.765625
0.66101695 0.71186441 0.62264151 0.76785714]
mean value: 0.7049603486957381
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.45945946 0.50666667 0.52 0.45333333 0.54666667 0.66216216
0.52702703 0.56756757 0.44594595 0.58108108]
mean value: 0.5269909909909909
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.81016949 0.81355932 0.81016949 0.82372881 0.83728814 0.86394558
0.81292517 0.83333333 0.79251701 0.85034014]
mean value: 0.8247976478727084
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.69353063 0.71242424 0.71454545 0.70166667 0.74151515 0.79699017
0.71805897 0.74514742 0.67751843 0.76099509]
mean value: 0.7262392223568696
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.37777778 0.40860215 0.41052632 0.39534884 0.46067416 0.5505618
0.41489362 0.46153846 0.35106383 0.49425287]
mean value: 0.43252398182805585
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.18
Accuracy on Blind test: 0.59
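The SettingWithCopyWarning printed at the start of this block points at scoresDF_CV['source_data'] = 'CV' and scoresDF_BT['source_data'] = 'BT' in MultClfs_logo_skf.py, and pandas' suggested remedy is to assign via .loc or to work on an explicit copy. A minimal sketch of that change with an assumed toy frame, not the repository's actual code:

# Sketch of the pandas-recommended fix for the SettingWithCopyWarning shown
# above: take an explicit copy of the slice (and assign with .loc) before
# adding the 'source_data' column. Variable names follow the warning text.
import pandas as pd

scores = pd.DataFrame({'MCC': [0.50, 0.18], 'Accuracy': [0.82, 0.59]})

scoresDF_CV = scores.iloc[:1].copy()     # explicit copy silences the warning
scoresDF_CV.loc[:, 'source_data'] = 'CV'

scoresDF_BT = scores.iloc[1:].copy()
scoresDF_BT.loc[:, 'source_data'] = 'BT'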
Extracting tts_split_name: logo_skf_BT_katg
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_katg
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
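The rowbind logged above stacks the (24, 8) CV and BT score frames into a single (48, 8) frame over their 8 common columns before the metadata merge. A sketch with assumed placeholder frames:

# Sketch (assumed toy frames) of the "rowbind" step logged above: two (24, 8)
# score frames with the same 8 columns are stacked by row into a (48, 8)
# combined frame, which is then merged with the metadata frame.
import pandas as pd

cols = ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
cv_df = pd.DataFrame(0.0, index=range(24), columns=cols).assign(source_data='CV')
bt_df = pd.DataFrame(0.0, index=range(24), columns=cols).assign(source_data='BT')

common_cols = cv_df.columns.intersection(bt_df.columns)
print('Number of Common columns:', len(common_cols))          # 8

combined_df_wf = pd.concat([cv_df, bt_df], axis=0, ignore_index=True)
print('nrows in combined_df_wf:', combined_df_wf.shape[0])    # 48
print('ncols in combined_df_wf:', combined_df_wf.shape[1])    # 8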
BTS gene: rpob
Total genes: 6
Training on: 4
Training on genes: ['katg', 'pnca', 'gid', 'embb']
Omitted genes: ['alr', 'rpob']
Blind test gene: rpob
/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_rpob.csv
Training data dim: (2630, 171)
Training Target dim: (2630,)
Checked training df does NOT have Target var
TEST data dim: (1132, 171)
TEST Target dim: (1132,)
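This split is a leave-one-gene-out style blind test: rows from the four training genes form the training set, rows from rpob form the blind-test set, and alr is omitted entirely. Below is a minimal sketch of building such a split from the combined CSV; the gene-identifier and target column names used here (gene_name, dst_mode) are assumptions, not read from the script.

# Minimal sketch (column names assumed) of the gene-based blind-test split
# logged above: train on four genes, hold out the blind-test gene.
import pandas as pd

combined_csv = '/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_rpob.csv'
df = pd.read_csv(combined_csv)

train_genes = ['katg', 'pnca', 'gid', 'embb']
bts_gene = 'rpob'
gene_col = 'gene_name'      # assumed name of the gene identifier column
target_col = 'dst_mode'     # assumed name of the binary target column

train_df = df[df[gene_col].isin(train_genes)]
test_df = df[df[gene_col] == bts_gene]

X_train, y_train = train_df.drop(columns=[target_col]), train_df[target_col]
X_test, y_test = test_df.drop(columns=[target_col]), test_df[target_col]

print('Training data dim:', X_train.shape)
print('TEST data dim:', X_test.shape)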
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
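Each (name, estimator) pair in the list above is run inside the same pipeline shown in the printouts: a ColumnTransformer that MinMax-scales the numeric columns and one-hot encodes the six categorical columns, with the remainder passed through, followed by the model. A sketch of that loop with placeholder column lists and a shortened model list; the real loop lives in MultClfs_logo_skf.py.

# Sketch of the per-classifier loop implied by the pipeline printouts above.
# Column lists and the models list are illustrative placeholders.
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

numerical_ix = ['snap2_score', 'volumetric_rr']      # placeholder subset of the 165 cols
categorical_ix = ['electrostatics_change', 'water_change', 'aa_prop_change',
                  'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), numerical_ix),
                  ('cat', OneHotEncoder(), categorical_ix)],
    remainder='passthrough')

models = [('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
          ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))]

for model_name, model_fn in models:
    pipe = Pipeline(steps=[('prep', prep), ('model', model_fn)])
    print('Running model pipeline:', pipe)   # then cross-validate pipe as sketched earlier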
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.66015625 0.64349842 0.66521692 0.69238138 0.59047127 0.58339763
0.58404994 0.58687758 0.6029377 0.60131979]
mean value: 0.6210306882858276
key: score_time
value: [0.01868844 0.018893 0.01834679 0.01899505 0.01799107 0.01817083
0.01774406 0.01793408 0.0196197 0.01975226]
mean value: 0.018613529205322266
key: test_mcc
value: [0.51831691 0.43768116 0.4192198 0.36115808 0.45183855 0.50892531
0.40869104 0.31569349 0.50622204 0.51345425]
mean value: 0.44412006303965407
key: train_mcc
value: [0.54779044 0.55328679 0.54400654 0.56308401 0.58419658 0.54208228
0.55469176 0.57402225 0.54824809 0.55110191]
mean value: 0.5562510652455857
key: test_fscore
value: [0.64827586 0.57352941 0.57342657 0.52173913 0.58992806 0.62773723
0.57142857 0.47761194 0.6442953 0.63309353]
mean value: 0.5861065600446713
key: train_fscore
value: [0.6645817 0.66875 0.66091052 0.67603435 0.69438029 0.66462011
0.67081712 0.68812261 0.66614786 0.67076923]
mean value: 0.6725133787330517
key: test_precision
value: [0.70149254 0.67241379 0.63076923 0.6 0.67213115 0.72881356
0.61764706 0.58181818 0.68571429 0.73333333]
mean value: 0.6624133127738461
key: train_precision
value: [0.7417103 0.7456446 0.74119718 0.75304348 0.76053963 0.72529313
0.74310345 0.74833333 0.73793103 0.73277311]
mean value: 0.7429569244015735
key: test_recall
value: [0.6025641 0.5 0.52564103 0.46153846 0.52564103 0.55128205
0.53164557 0.40506329 0.60759494 0.55696203]
mean value: 0.5267932489451478
key: train_recall
value: [0.601983 0.60623229 0.59631728 0.61331445 0.6388102 0.61331445
0.61134752 0.63687943 0.6070922 0.61843972]
mean value: 0.6143730536636329
key: test_accuracy
value: [0.80608365 0.77946768 0.76806084 0.74904943 0.78326996 0.80608365
0.76045627 0.7338403 0.79847909 0.80608365]
mean value: 0.7790874524714829
key: train_accuracy
value: [0.81875792 0.8208703 0.81749049 0.82467258 0.83227714 0.81537812
0.82129278 0.82805239 0.81875792 0.8191804 ]
mean value: 0.8216730038022814
key: test_roc_auc
value: [0.747228 0.69864865 0.69795565 0.66590437 0.70876646 0.73239778
0.69517061 0.64003165 0.74401486 0.73500275]
mean value: 0.7065120768815045
key: train_roc_auc
value: [0.75644003 0.75916672 0.75390819 0.76391189 0.77665977 0.75728937
0.76084825 0.77301252 0.75781806 0.76138592]
mean value: 0.7620440713640926
key: test_jcc
value: [0.47959184 0.40206186 0.40196078 0.35294118 0.41836735 0.45744681
0.4 0.31372549 0.47524752 0.46315789]
mean value: 0.4164500718323921
key: train_jcc
value: [0.49765808 0.50234742 0.49355217 0.51061321 0.53183962 0.49770115
0.50468384 0.52453271 0.49941657 0.50462963]
mean value: 0.5066974395983235
MCC on Blind test: 0.28
Accuracy on Blind test: 0.73
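The 'MCC on Blind test' / 'Accuracy on Blind test' pair printed after every classifier is a single evaluation on the held-out blind-test gene rather than a cross-validated score. A self-contained sketch, on synthetic data, of how those two numbers are obtained:

# Sketch of the two blind-test numbers printed after each classifier: fit on
# the training split, predict on the held-out split, then score MCC and
# accuracy. The data here is synthetic; the real split is the gene-based one.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import matthews_corrcoef, accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, weights=[0.7, 0.3], random_state=42)
X_train, X_bts, y_train, y_bts = train_test_split(X, y, test_size=0.3,
                                                  stratify=y, random_state=42)

clf = AdaBoostClassifier(random_state=42).fit(X_train, y_train)
y_bts_pred = clf.predict(X_bts)

print('MCC on Blind test:', round(matthews_corrcoef(y_bts, y_bts_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_bts, y_bts_pred), 2))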
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.29936886 0.33346748 0.36325121 0.35919499 0.34344101 0.34532547
0.33557439 0.34263635 0.35180497 0.34542799]
mean value: 0.3419492721557617
key: score_time
value: [0.04276228 0.04156923 0.04079032 0.04497981 0.03811145 0.04596567
0.04487777 0.0449152 0.04076576 0.04532838]
mean value: 0.04300658702850342
key: test_mcc
value: [0.40914347 0.48488848 0.49600301 0.33773385 0.4605833 0.44164626
0.37383545 0.34805236 0.47885221 0.39451236]
mean value: 0.4225250755982509
key: train_mcc
value: [0.95666791 0.96568474 0.95863865 0.95771693 0.95256253 0.96269828
0.96364031 0.94345992 0.97069743 0.96467977]
mean value: 0.9596446454221755
key: test_fscore
value: [0.55474453 0.60150376 0.6119403 0.5106383 0.57364341 0.5648855
0.51515152 0.5037037 0.59701493 0.53030303]
mean value: 0.5563528962893047
key: train_fscore
value: [0.96868172 0.97543353 0.97022513 0.96938776 0.96576839 0.97316896
0.97391304 0.95888399 0.97909156 0.97461929]
mean value: 0.9709173371940375
key: test_precision
value: [0.6440678 0.72727273 0.73214286 0.57142857 0.7254902 0.69811321
0.64150943 0.60714286 0.72727273 0.66037736]
mean value: 0.673481773294834
key: train_precision
value: [0.9970015 0.99557522 0.99552906 0.9984985 0.994003 0.99702823
0.99555556 0.99391172 0.99560117 0.99703264]
mean value: 0.9959736599854068
key: test_recall
value: [0.48717949 0.51282051 0.52564103 0.46153846 0.47435897 0.47435897
0.43037975 0.43037975 0.50632911 0.44303797]
mean value: 0.4746024018175917
key: train_recall
value: [0.94192635 0.95609065 0.94617564 0.94192635 0.93909348 0.95042493
0.95319149 0.92624113 0.96312057 0.95319149]
mean value: 0.9471382074618768
key: test_accuracy
value: [0.76806084 0.79847909 0.80228137 0.73764259 0.79087452 0.78326996
0.75665399 0.74524715 0.79467681 0.76425856]
mean value: 0.7741444866920152
key: train_accuracy
value: [0.98183354 0.98563583 0.9826785 0.98225602 0.98014364 0.9843684
0.98479087 0.97634136 0.9877482 0.98521335]
mean value: 0.9831009716941276
key: test_roc_auc
value: [0.68683299 0.71586972 0.72227997 0.65779626 0.69934165 0.69393624
0.66355944 0.65540726 0.71240369 0.67260594]
mean value: 0.6880033160674328
key: train_roc_auc
value: [0.97036113 0.97714226 0.97218475 0.97066215 0.96834265 0.97461042
0.97569322 0.9619172 0.98065776 0.97599406]
mean value: 0.9727565576808663
key: test_jcc
value: [0.38383838 0.43010753 0.44086022 0.34285714 0.40217391 0.39361702
0.34693878 0.33663366 0.42553191 0.36082474]
mean value: 0.38633832989892836
key: train_jcc
value: [0.93926554 0.95204513 0.94217207 0.94059406 0.93380282 0.94774011
0.94915254 0.92101551 0.95903955 0.95049505]
mean value: 0.9435322388069156
MCC on Blind test: 0.29
Accuracy on Blind test: 0.74
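The UserWarning repeated at the top of this block ('Some inputs do not have OOB scores ... too few estimators') arises when oob_score=True is combined with too few bagging estimators for every sample to fall out-of-bag at least once. One common remedy, sketched below, is simply to grow more estimators; n_estimators=100 is an illustrative value, not the project's setting.

# Sketch of one way to address the "Some inputs do not have OOB scores"
# UserWarning: request more bagging estimators so that every sample is
# out-of-bag for at least one of them.
from sklearn.ensemble import BaggingClassifier

bagging = BaggingClassifier(n_estimators=100,   # default is 10, often too few for OOB
                            oob_score=True,
                            n_jobs=10,
                            random_state=42)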
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.17749643 0.15449786 0.15684819 0.16370392 0.16305971 0.16301155
0.15757155 0.16442728 0.1584146 0.18407607]
mean value: 0.16431071758270263
key: score_time
value: [0.0099802 0.01019144 0.01005006 0.0099442 0.01092458 0.01035547
0.01049137 0.00983071 0.00986147 0.00964165]
mean value: 0.010127115249633788
key: test_mcc
value: [0.31906587 0.33446456 0.39769402 0.31666526 0.33966926 0.1979649
0.32396696 0.20508348 0.44618127 0.28422068]
mean value: 0.31649762610519955
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.52229299 0.5398773 0.58682635 0.52760736 0.54545455 0.44444444
0.52830189 0.43421053 0.61146497 0.49350649]
mean value: 0.5233986868179241
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.51898734 0.51764706 0.5505618 0.50588235 0.51724138 0.42857143
0.525 0.45205479 0.61538462 0.50666667]
mean value: 0.513799743574327
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.52564103 0.56410256 0.62820513 0.55128205 0.57692308 0.46153846
0.53164557 0.41772152 0.60759494 0.48101266]
mean value: 0.5345666991236611
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7148289 0.7148289 0.73764259 0.70722433 0.7148289 0.65779468
0.7148289 0.6730038 0.76806084 0.70342205]
mean value: 0.7106463878326996
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66011781 0.67124047 0.70599446 0.66212751 0.67494802 0.6010395
0.66256192 0.60016511 0.72227573 0.63996285]
mean value: 0.6600433378109491
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.35344828 0.3697479 0.41525424 0.35833333 0.375 0.28571429
0.35897436 0.27731092 0.44036697 0.32758621]
mean value: 0.356173649407521
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.16
Accuracy on Blind test: 0.65
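Note: each classifier block lists fit_time, score_time and paired test_/train_ metrics, which is the shape of scikit-learn's cross_validate output when called with return_train_score=True and a dictionary of scorers; the "skf" in the log name suggests StratifiedKFold splits. A minimal sketch under those assumptions, run on synthetic data rather than the combined feature set.

# Minimal sketch (an assumption about the underlying call, not the original
# script): cross_validate with StratifiedKFold and custom scorers yields
# fit_time/score_time/test_*/train_* blocks like the ones logged above.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef, jaccard_score
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=200, random_state=42)   # synthetic data
scorers = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': 'f1',
           'precision': 'precision',
           'recall': 'recall',
           'accuracy': 'accuracy',
           'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(ExtraTreesClassifier(random_state=42), X, y, cv=skf,
                        scoring=scorers, return_train_score=True)
for key, value in scores.items():      # e.g. 'test_mcc', 'train_mcc', ...
    print('key:', key)
    print('value:', value)
    print('mean value:', value.mean())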
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02024841 0.02110362 0.02106261 0.02138305 0.02130175 0.02117705
0.02258706 0.02065539 0.02072144 0.02019024]
mean value: 0.021043062210083008
key: score_time
value: [0.01071525 0.00990891 0.01038027 0.01048565 0.01068664 0.01060414
0.01065707 0.00999355 0.01039839 0.01039553]
mean value: 0.010422539710998536
key: test_mcc
value: [0.33229227 0.19833083 0.32294891 0.26994613 0.2597154 0.14934719
0.3046154 0.17434582 0.22682553 0.15574422]
mean value: 0.23941116892012254
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.52903226 0.43137255 0.5308642 0.46896552 0.45390071 0.4
0.51851852 0.41059603 0.46987952 0.39735099]
mean value: 0.4610480287534583
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.53246753 0.44 0.51190476 0.50746269 0.50793651 0.4025974
0.5060241 0.43055556 0.44827586 0.41666667]
mean value: 0.47038910721500987
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.52564103 0.42307692 0.55128205 0.43589744 0.41025641 0.3974359
0.53164557 0.39240506 0.49367089 0.37974684]
mean value: 0.4541058098020123
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.72243346 0.66920152 0.71102662 0.70722433 0.70722433 0.64638783
0.70342205 0.66159696 0.66539924 0.6539924 ]
mean value: 0.6847908745247148
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.66552322 0.59802495 0.66483021 0.62875953 0.62134442 0.57439362
0.65440974 0.58478949 0.61640066 0.57574298]
mean value: 0.6184218825743317
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.35964912 0.275 0.36134454 0.30630631 0.29357798 0.25
0.35 0.25833333 0.30708661 0.24793388]
mean value: 0.30092317803839086
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.23
Accuracy on Blind test: 0.7
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.4247551 0.42928338 0.4232676 0.41636801 0.41626096 0.4142983
0.4137609 0.41834641 0.40959477 0.41176152]
mean value: 0.4177696943283081
key: score_time
value: [0.02388263 0.02427459 0.0238328 0.02376556 0.02363038 0.02305984
0.02357888 0.02385139 0.02350998 0.02351928]
mean value: 0.02369053363800049
key: test_mcc
value: [0.44394926 0.42793462 0.47566804 0.37072092 0.41234419 0.37072092
0.43044769 0.31586605 0.41518928 0.35990604]
mean value: 0.40227470093356743
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.57142857 0.546875 0.59701493 0.496 0.5203252 0.496
0.53968254 0.44262295 0.54545455 0.51470588]
mean value: 0.5270109618363437
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.69090909 0.7 0.71428571 0.65957447 0.71111111 0.65957447
0.72340426 0.62790698 0.67924528 0.61403509]
mean value: 0.6780046455277631
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.48717949 0.44871795 0.51282051 0.3974359 0.41025641 0.3974359
0.43037975 0.34177215 0.4556962 0.44303797]
mean value: 0.43247322297955204
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.78326996 0.77946768 0.79467681 0.76045627 0.7756654 0.76045627
0.77946768 0.74144487 0.77186312 0.74904943]
mean value: 0.7695817490494298
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.6976438 0.68381843 0.71316701 0.65547471 0.66999307 0.65547471
0.67986379 0.62740782 0.68165245 0.66173638]
mean value: 0.6726232154850758
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.4 0.37634409 0.42553191 0.32978723 0.35164835 0.32978723
0.36956522 0.28421053 0.375 0.34653465]
mean value: 0.3588409217821021
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.31
Accuracy on Blind test: 0.75
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [2.87932181 2.78957963 2.7834518 2.77825069 2.76456809 2.76696944
2.76816607 2.86399889 2.86249971 2.82354784]
mean value: 2.808035397529602
key: score_time
value: [0.01013947 0.01029181 0.01042128 0.01027775 0.01003551 0.01028013
0.01038885 0.01131272 0.01032209 0.01034522]
mean value: 0.010381484031677246
key: test_mcc
value: [0.50892531 0.5983093 0.53742235 0.39219112 0.46100057 0.49791671
0.44295535 0.37687801 0.52248923 0.53169649]
mean value: 0.4869784450891288
key: train_mcc
value: [0.69414595 0.68541368 0.68792198 0.69313975 0.69423834 0.70176262
0.69709707 0.69093456 0.69934846 0.68739646]
mean value: 0.6931398880556389
key: test_fscore
value: [0.62773723 0.69117647 0.66206897 0.55555556 0.60689655 0.61764706
0.57971014 0.52238806 0.63768116 0.64233577]
mean value: 0.6143196958958749
key: train_fscore
value: [0.76801267 0.76130056 0.76622361 0.76850394 0.76947286 0.77337559
0.76837061 0.76875 0.77287066 0.76400947]
mean value: 0.7680889963564856
key: test_precision
value: [0.72881356 0.81034483 0.71641791 0.60606061 0.65671642 0.72413793
0.6779661 0.63636364 0.74576271 0.75862069]
mean value: 0.7061204391939669
key: train_precision
value: [0.87073609 0.86486486 0.85514834 0.86524823 0.86548673 0.87769784
0.87934186 0.85565217 0.87033748 0.86120996]
mean value: 0.8665723568280839
key: test_recall
value: [0.55128205 0.6025641 0.61538462 0.51282051 0.56410256 0.53846154
0.50632911 0.44303797 0.55696203 0.55696203]
mean value: 0.5447906523855892
key: train_recall
value: [0.68696884 0.67988669 0.69405099 0.69121813 0.69263456 0.69121813
0.6822695 0.69787234 0.69503546 0.68652482]
mean value: 0.6897679464770056
key: test_accuracy
value: [0.80608365 0.84030418 0.81368821 0.75665399 0.78326996 0.80228137
0.77946768 0.75665399 0.80988593 0.81368821]
mean value: 0.7961977186311786
key: train_accuracy
value: [0.87621462 0.87283481 0.87367976 0.87579214 0.87621462 0.87917195
0.87748204 0.87494719 0.878327 0.87367976]
mean value: 0.8758343895226025
key: test_roc_auc
value: [0.73239778 0.77155232 0.75634096 0.68613999 0.71988912 0.72598753
0.70153412 0.66717116 0.73772014 0.74043753]
mean value: 0.7239170653232293
key: train_roc_auc
value: [0.82181073 0.81736658 0.82204055 0.82273128 0.8234395 0.82513947
0.82127916 0.82396625 0.82555624 0.81979671]
mean value: 0.8223126458879658
key: test_jcc
value: [0.45744681 0.52808989 0.49484536 0.38461538 0.43564356 0.44680851
0.40816327 0.35353535 0.46808511 0.47311828]
mean value: 0.44503515213802947
key: train_jcc
value: [0.62339332 0.61459667 0.62103929 0.62404092 0.62531969 0.63049096
0.62386511 0.62436548 0.62982005 0.61813538]
mean value: 0.623506686790386
MCC on Blind test: 0.31
Accuracy on Blind test: 0.74
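Note: the test_jcc and train_jcc blocks report the Jaccard coefficient, TP / (TP + FP + FN), which for binary labels relates to the F-score as J = F / (2 - F); that is why each jcc block tracks the corresponding fscore block above it. A minimal sketch with made-up labels.

# Minimal sketch (made-up labels): Jaccard score and its relation to F-score.
from sklearn.metrics import jaccard_score, f1_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
j = jaccard_score(y_true, y_pred)   # TP/(TP+FP+FN) = 3/5 = 0.6
f = f1_score(y_true, y_pred)        # 2TP/(2TP+FP+FN) = 0.75
print(j, f / (2 - f))               # the two values agree: 0.6 0.6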
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.01901984 0.01852489 0.01955152 0.01933932 0.01972651 0.01879215
0.0193212 0.01832676 0.01921535 0.02068949]
mean value: 0.019250702857971192
key: score_time
value: [0.01011157 0.01024199 0.01047206 0.01015615 0.01033425 0.01011229
0.01036572 0.00990176 0.01060867 0.01106262]
mean value: 0.010336709022521973
key: test_mcc
value: [0.35905972 0.34533294 0.24640606 0.1931669 0.2499428 0.21623056
0.31290221 0.33139431 0.26995551 0.24273057]
mean value: 0.2767121579708969
key: train_mcc
value: [0.27709423 0.28124335 0.29494256 0.29091076 0.29208286 0.29251736
0.27560588 0.29835779 0.27683058 0.28221158]
mean value: 0.28617969505619933
key: test_fscore
value: [0.57711443 0.56701031 0.51612903 0.47120419 0.50777202 0.46783626
0.54545455 0.56122449 0.53211009 0.50761421]
mean value: 0.5253469576105669
key: train_fscore
value: [0.52714932 0.52747253 0.53927577 0.535815 0.53777778 0.53786192
0.5260181 0.53788317 0.52749719 0.5289067 ]
mean value: 0.5325657471090228
key: test_precision
value: [0.47154472 0.47413793 0.4028777 0.39823009 0.42608696 0.43010753
0.47222222 0.47008547 0.41726619 0.42372881]
mean value: 0.4386287609139773
key: train_precision
value: [0.43879473 0.4457478 0.44444444 0.44517338 0.44241316 0.44311927
0.4374412 0.45410156 0.4363974 0.44337812]
mean value: 0.44310110698665495
key: test_recall
value: [0.74358974 0.70512821 0.71794872 0.57692308 0.62820513 0.51282051
0.64556962 0.69620253 0.73417722 0.63291139]
mean value: 0.6593476144109056
key: train_recall
value: [0.66005666 0.64589235 0.68555241 0.67280453 0.68555241 0.68413598
0.65957447 0.65957447 0.66666667 0.65531915]
mean value: 0.6675129086050671
key: test_accuracy
value: [0.67680608 0.68060837 0.60076046 0.61596958 0.63878327 0.6539924
0.67680608 0.6730038 0.6121673 0.63117871]
mean value: 0.6460076045627376
key: train_accuracy
value: [0.64681031 0.65483735 0.65061259 0.65230249 0.64850021 0.64934516
0.64596536 0.66244191 0.64427545 0.65230249]
mean value: 0.6507393324883819
key: test_roc_auc
value: [0.6961192 0.68769924 0.63465003 0.60467775 0.63572419 0.61316701
0.66789351 0.679623 0.64697991 0.63167309]
mean value: 0.649820693221904
key: train_roc_auc
value: [0.65061833 0.65226586 0.660657 0.65819637 0.65915188 0.65934674
0.6498835 0.66161636 0.65072202 0.65317101]
mean value: 0.6555629067100331
key: test_jcc
value: [0.40559441 0.39568345 0.34782609 0.30821918 0.34027778 0.30534351
0.375 0.39007092 0.3625 0.34013605]
mean value: 0.35706513895062725
key: train_jcc
value: [0.35791091 0.35820896 0.36918383 0.36594761 0.36778116 0.36785986
0.35686876 0.36787975 0.35823171 0.35953307]
mean value: 0.3629405612767182
MCC on Blind test: 0.25
Accuracy on Blind test: 0.69
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [2.61084819 2.54851556 2.75464892 2.58350253 2.5580616 2.51013517
2.6775713 2.81703162 2.55160832 2.69899321]
mean value: 2.6310916423797606
key: score_time
value: [0.07431507 0.07787752 0.0965147 0.0738337 0.0884769 0.09142637
0.07430339 0.07444596 0.07457805 0.07390952]
mean value: 0.07996811866760253
key: test_mcc
value: [0.37111431 0.35611954 0.32267607 0.21921066 0.24604652 0.27406546
0.33124797 0.25319831 0.21281492 0.11132748]
mean value: 0.2697821237119716
key: train_mcc
value: [0.63366427 0.63228467 0.62235103 0.62379874 0.63228467 0.63374234
0.63406136 0.63054939 0.63074968 0.62311908]
mean value: 0.6296605243209253
key: test_fscore
value: [0.45614035 0.41121495 0.42105263 0.3047619 0.35714286 0.34285714
0.43103448 0.31067961 0.32432432 0.25225225]
mean value: 0.36114605114747556
key: train_fscore
value: [0.67759563 0.68050542 0.66969973 0.67389341 0.68050542 0.68458781
0.68231047 0.67217631 0.67518248 0.67389341]
mean value: 0.6770350070192144
key: test_precision
value: [0.72222222 0.75862069 0.66666667 0.59259259 0.58823529 0.66666667
0.67567568 0.66666667 0.5625 0.4375 ]
mean value: 0.633734647426331
key: train_precision
value: [0.94897959 0.93781095 0.93638677 0.93017456 0.93781095 0.93170732
0.93796526 0.953125 0.94629156 0.9278607 ]
mean value: 0.9388112648661648
key: test_recall
value: [0.33333333 0.28205128 0.30769231 0.20512821 0.25641026 0.23076923
0.3164557 0.20253165 0.2278481 0.17721519]
mean value: 0.25394352482960075
key: train_recall
value: [0.52691218 0.53399433 0.52124646 0.52832861 0.53399433 0.54107649
0.53617021 0.51914894 0.5248227 0.52907801]
mean value: 0.5294772266088039
key: test_accuracy
value: [0.76425856 0.76045627 0.74904943 0.72243346 0.72623574 0.73764259
0.74904943 0.73003802 0.7148289 0.68441065]
mean value: 0.7338403041825095
key: train_accuracy
value: [0.8504436 0.8504436 0.84664132 0.84748627 0.8504436 0.85128855
0.85128855 0.84917617 0.84959865 0.84748627]
mean value: 0.8494296577946768
key: test_roc_auc
value: [0.63963964 0.62210672 0.62141372 0.57283437 0.59036729 0.59106029
0.62561915 0.57952669 0.57588057 0.53969455]
mean value: 0.5958143006051647
key: train_roc_auc
value: [0.75743562 0.75947158 0.75309764 0.75573565 0.75947158 0.76210959
0.76056405 0.75415931 0.75609366 0.75581458]
mean value: 0.7573953248239292
key: test_jcc
value: [0.29545455 0.25882353 0.26666667 0.17977528 0.2173913 0.20689655
0.27472527 0.18390805 0.19354839 0.1443299 ]
mean value: 0.2221519483210094
key: train_jcc
value: [0.51239669 0.51573187 0.50341997 0.50817439 0.51573187 0.52043597
0.51780822 0.50622407 0.50964187 0.50817439]
mean value: 0.5117739315135883
MCC on Blind test: 0.18
Accuracy on Blind test: 0.72
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02105522 0.01499987 0.01559758 0.01560879 0.0157392 0.01604772
0.01743364 0.01631331 0.01584506 0.01569533]
mean value: 0.016433572769165038
key: score_time
value: [0.05166197 0.02500176 0.02847052 0.03002191 0.02547193 0.0282352
0.02619004 0.02469611 0.02487302 0.02473855]
mean value: 0.028936100006103516
key: test_mcc
value: [0.27705301 0.26196784 0.26760087 0.21336636 0.20396719 0.15737472
0.26602372 0.30054781 0.12450962 0.0992627 ]
mean value: 0.21716738401331184
key: train_mcc
value: [0.46750257 0.48752864 0.4801803 0.49876179 0.51098795 0.47980637
0.50192052 0.4829467 0.50470364 0.48677205]
mean value: 0.49011105319315257
key: test_fscore
value: [0.45588235 0.44927536 0.43939394 0.39694656 0.4 0.36090226
0.43076923 0.44444444 0.36111111 0.28099174]
mean value: 0.40197169970405267
key: train_fscore
value: [0.58606213 0.60545906 0.59563758 0.61639344 0.62510254 0.59427609
0.61319967 0.59714045 0.61679135 0.60414938]
mean value: 0.6054211706127077
key: test_precision
value: [0.53448276 0.51666667 0.53703704 0.49056604 0.47368421 0.43636364
0.54901961 0.59574468 0.4 0.4047619 ]
mean value: 0.4938326540406301
key: train_precision
value: [0.71958763 0.72763419 0.73045267 0.73151751 0.74269006 0.73236515
0.74593496 0.73347107 0.74497992 0.728 ]
mean value: 0.733663316543796
key: test_recall
value: [0.3974359 0.3974359 0.37179487 0.33333333 0.34615385 0.30769231
0.35443038 0.35443038 0.32911392 0.21518987]
mean value: 0.34070107108081793
key: train_recall
value: [0.49433428 0.5184136 0.50283286 0.5325779 0.53966006 0.5
0.52056738 0.5035461 0.52624113 0.51631206]
mean value: 0.515448536355052
key: test_accuracy
value: [0.71863118 0.71102662 0.71863118 0.69961977 0.69201521 0.67680608
0.71863118 0.7338403 0.65019011 0.66920152]
mean value: 0.6988593155893537
key: train_accuracy
value: [0.79171948 0.79847909 0.79636671 0.80228137 0.8069286 0.79636671
0.80439375 0.79763414 0.8052387 0.79847909]
mean value: 0.7997887621461766
key: test_roc_auc
value: [0.62574498 0.62033957 0.61832987 0.59369369 0.59199584 0.57006237
0.61471519 0.62558476 0.5585787 0.53966015]
mean value: 0.5958705120386464
key: train_roc_auc
value: [0.70622795 0.71796658 0.71198235 0.72474771 0.73009493 0.71116797
0.72267839 0.71296439 0.72491359 0.71724147]
mean value: 0.7179985328072309
key: test_jcc
value: [0.2952381 0.28971963 0.2815534 0.24761905 0.25 0.22018349
0.2745098 0.28571429 0.22033898 0.16346154]
mean value: 0.25283382644703917
key: train_jcc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
value: [0.41448931 0.4341637 0.42413381 0.44549763 0.45465394 0.42275449
0.44216867 0.42565947 0.44591346 0.43281807]
mean value: 0.4342252565140387
MCC on Blind test: 0.19
Accuracy on Blind test: 0.72
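Note: the lbfgs ConvergenceWarning interleaved above comes from logistic-regression fits hitting the default iteration limit; the warning itself recommends raising max_iter or scaling the data. A minimal sketch of that remedy; the max_iter value is illustrative, not taken from this run.

# Minimal sketch of the remedy suggested by the ConvergenceWarning above:
# raise max_iter (3000 here is illustrative) and keep the features scaled,
# which the MinMaxScaler step in the pipelines already does.
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(random_state=42, max_iter=3000)  # default max_iter=100
# 'clf' could then replace LogisticRegression(random_state=42) in the pipeline.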
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.09200859 0.11468816 0.10380125 0.10329723 0.0986588 0.10788202
0.09894681 0.10102844 0.0991993 0.09919643]
mean value: 0.10187070369720459
key: score_time
value: [0.01303577 0.01304603 0.01310992 0.01307225 0.01304531 0.01301193
0.01306033 0.01320028 0.01306772 0.01309204]
mean value: 0.013074159622192383
key: test_mcc
value: [0.50083535 0.51177626 0.52702899 0.40425218 0.44657327 0.43499812
0.46868711 0.36702471 0.48161061 0.38225033]
mean value: 0.4525036933348391
key: train_mcc
value: [0.5370565 0.53492299 0.52146622 0.52398752 0.5340121 0.53535731
0.53705091 0.53818683 0.52936984 0.53755378]
mean value: 0.5328963977512504
key: test_fscore
value: [0.64473684 0.65359477 0.66225166 0.55714286 0.5915493 0.56716418
0.61744966 0.52857143 0.62068966 0.54545455]
mean value: 0.5988604894626134
key: train_fscore
value: [0.65998458 0.65844256 0.64923077 0.65019305 0.66057839 0.65789474
0.65891473 0.66258607 0.6515625 0.65996909]
mean value: 0.6569356474186483
key: test_precision
value: [0.66216216 0.66666667 0.68493151 0.62903226 0.65625 0.67857143
0.65714286 0.60655738 0.68181818 0.609375 ]
mean value: 0.6532507438324308
key: train_precision
value: [0.72419628 0.72250423 0.71043771 0.7147708 0.71381579 0.72525597
0.72649573 0.7192691 0.72521739 0.72495756]
mean value: 0.7206920554152878
key: test_recall
value: [0.62820513 0.64102564 0.64102564 0.5 0.53846154 0.48717949
0.58227848 0.46835443 0.56962025 0.49367089]
mean value: 0.5549821486530349
key: train_recall
value: [0.60623229 0.60481586 0.59773371 0.59631728 0.61473088 0.601983
0.60283688 0.6141844 0.59148936 0.60567376]
mean value: 0.6035997428324593
key: test_accuracy
value: [0.79467681 0.79847909 0.80608365 0.76425856 0.77946768 0.77946768
0.78326996 0.74904943 0.79087452 0.75285171]
mean value: 0.7798479087452471
key: train_accuracy
value: [0.81368821 0.81284326 0.80735108 0.8086185 0.81157583 0.81326574
0.81411069 0.81368821 0.81157583 0.81411069]
mean value: 0.8120828052386988
key: test_roc_auc
value: [0.746535 0.75294525 0.75835066 0.68783784 0.70977131 0.69494109
0.72592185 0.66895982 0.72774491 0.67890066]
mean value: 0.715190839308891
key: train_roc_auc
value: [0.75404932 0.75304008 0.74709082 0.7475867 0.75498735 0.75252672
0.75328366 0.75624984 0.74821159 0.75410042]
mean value: 0.7521126496637152
key: test_jcc
value: [0.47572816 0.48543689 0.4950495 0.38613861 0.42 0.39583333
0.44660194 0.3592233 0.45 0.375 ]
mean value: 0.42890117434073505
key: train_jcc
value: [0.49252014 0.4908046 0.48063781 0.48169336 0.49318182 0.49019608
0.49132948 0.49542334 0.48319815 0.49250288]
mean value: 0.48914876596988827
MCC on Blind test: 0.24
Accuracy on Blind test: 0.73
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.07033682 0.05937266 0.06054139 0.05784941 0.06276155 0.05941129
0.11057734 0.10262299 0.09507155 0.0658288 ]
mean value: 0.07443737983703613
key: score_time
value: [0.01712704 0.01369071 0.0136168 0.01382732 0.01479244 0.01578426
0.01486659 0.01689029 0.01403666 0.01392746]
mean value: 0.01485595703125
key: test_mcc
value: [0.51615251 0.54819634 0.57784459 0.41239173 0.44347679 0.44914985
0.44657815 0.42307188 0.48984995 0.43725032]
mean value: 0.47439621187003356
key: train_mcc
value: [0.51056428 0.50670521 0.50076 0.51717766 0.51649949 0.52388132
0.51370782 0.52259615 0.50931211 0.51516634]
mean value: 0.5136370393257916
key: test_fscore
value: [0.65333333 0.67123288 0.69387755 0.56115108 0.58571429 0.5625
0.59863946 0.56521739 0.625 0.58156028]
mean value: 0.609822625669165
key: train_fscore
value: [0.63507109 0.63608087 0.62951334 0.64330218 0.64335664 0.64852255
0.63772691 0.64914992 0.6375682 0.63937008]
mean value: 0.6399661794313108
key: test_precision
value: [0.68055556 0.72058824 0.73913043 0.63934426 0.66129032 0.72
0.64705882 0.66101695 0.69230769 0.66129032]
mean value: 0.6822582598078301
key: train_precision
value: [0.71785714 0.70517241 0.70598592 0.71453287 0.71256454 0.71896552
0.71886121 0.71307301 0.70761246 0.71858407]
mean value: 0.7133209147848404
key: test_recall
value: [0.62820513 0.62820513 0.65384615 0.5 0.52564103 0.46153846
0.55696203 0.49367089 0.56962025 0.51898734]
mean value: 0.5536676403765013
key: train_recall
value: [0.5694051 0.57932011 0.56798867 0.58498584 0.58640227 0.59065156
0.57304965 0.59574468 0.58014184 0.57588652]
mean value: 0.5803576236111948
key: test_accuracy
value: [0.80228137 0.81749049 0.82889734 0.76806084 0.77946768 0.78707224
0.7756654 0.77186312 0.79467681 0.7756654 ]
mean value: 0.7901140684410646
key: train_accuracy
value: [0.80481622 0.80228137 0.80059147 0.80650613 0.80608365 0.80904098
0.80608365 0.80819603 0.8035488 0.80650613]
mean value: 0.8053654414871145
key: test_roc_auc
value: [0.7519404 0.76275121 0.77827443 0.69054054 0.70606376 0.69293139
0.71326362 0.69248762 0.7304623 0.70242845]
mean value: 0.7221143724796724
key: train_roc_auc
value: [0.73714084 0.73818504 0.73372341 0.74282404 0.74293021 0.74625895
0.73899173 0.74702998 0.73922856 0.74010933]
mean value: 0.740642210465735
key: test_jcc
value: [0.48514851 0.50515464 0.53125 0.39 0.41414141 0.39130435
0.42718447 0.39393939 0.45454545 0.41 ]
mean value: 0.44026682304985093
key: train_jcc
value: [0.46527778 0.4663626 0.45933562 0.47416762 0.4742268 0.47986191
0.46813441 0.4805492 0.46796339 0.46990741]
mean value: 0.4705786747672275
MCC on Blind test: 0.27
Accuracy on Blind test: 0.74
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [0.73673105 0.88648272 0.8190732 0.80960846 0.90032864 0.81108999
0.90703511 0.80606818 0.81044698 0.93546557]
mean value: 0.8422329902648926
key: score_time
value: [0.01349354 0.01577711 0.01656175 0.01651406 0.01353145 0.01659322
0.01696348 0.01663947 0.01652288 0.01659441]
mean value: 0.015919137001037597
key: test_mcc
value: [0. 0.53742235 0.55431233 0.38463987 0. 0.46905807
0. 0.38560845 0.49587233 0.41462633]
mean value: 0.3241539736603944
key: train_mcc
value: [0. 0.49779154 0.49116126 0.50910701 0. 0.51419366
0. 0.5070236 0.50476547 0.51249382]
mean value: 0.35365363700941665
key: test_fscore
value: [0. 0.66206897 0.67132867 0.54285714 0. 0.57142857
0. 0.52631579 0.62411348 0.56115108]
mean value: 0.41592636949193074
key: train_fscore
value: [0. 0.62618297 0.61953932 0.63341251 0. 0.63650794
0. 0.63422292 0.63083004 0.63484487]
mean value: 0.44155405568208617
key: test_precision
value: [0. 0.71641791 0.73846154 0.61290323 0. 0.75
0. 0.64814815 0.70967742 0.65 ]
mean value: 0.48256082422187385
key: train_precision
value: [0. 0.70640569 0.70524412 0.71813285 0. 0.72382671
0. 0.71001757 0.7125 0.72282609]
mean value: 0.4998953047944325
key: test_recall
value: [0. 0.61538462 0.61538462 0.48717949 0. 0.46153846
0. 0.44303797 0.55696203 0.49367089]
mean value: 0.36731580655631285
key: train_recall
value: [0. 0.56232295 0.55240793 0.56657224 0. 0.56798867
0. 0.57304965 0.56595745 0.56595745]
mean value: 0.39542563237096423
key: test_accuracy
value: [0.70342205 0.81368821 0.82129278 0.75665399 0.70342205 0.79467681
0.69961977 0.76045627 0.79847909 0.76806084]
mean value: 0.761977186311787
key: train_accuracy
value: [0.70173215 0.79974651 0.79763414 0.80439375 0.70173215 0.80650613
0.70215463 0.80312632 0.80270384 0.80608365]
mean value: 0.772581326573722
key: test_roc_auc
value: [0.5 0.75634096 0.76174636 0.67872488 0.5 0.6983368
0.5 0.66988855 0.72956797 0.68977023]
mean value: 0.6484375742534796
key: train_roc_auc
value: [0.5 0.7314926 0.72713714 0.73602543 0.5 0.73793774
0.5 0.73688583 0.7345431 0.73694984]
mean value: 0.6640971691951011
key: test_jcc
value: [0. 0.49484536 0.50526316 0.37254902 0. 0.4
0. 0.35714286 0.45360825 0.39 ]
mean value: 0.297340864289286
key: train_jcc
value: [0. 0.45579793 0.44879171 0.46349942 0. 0.46682189
0. 0.46436782 0.46073903 0.46503497]
mean value: 0.3225052765713964
MCC on Blind test: 0.27
Accuracy on Blind test: 0.74
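The wall of lbfgs ConvergenceWarnings above means LogisticRegressionCV hit its default iteration cap (max_iter=100) on every fold, even though the numeric features are already MinMax-scaled. A minimal sketch of raising the cap inside the same kind of pipeline; the column subsets and the max_iter value of 4000 are illustrative placeholders, not taken from the original script:

# Hedged sketch -- placeholder columns, illustrative max_iter; not the original script's code.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegressionCV

num_cols = ['snap2_score', 'volumetric_rr']      # stand-in for the 165 numeric features above
cat_cols = ['ss_class', 'active_site']           # stand-in for the 6 categorical features above

prep = ColumnTransformer(remainder='passthrough',
                         transformers=[('num', MinMaxScaler(), num_cols),
                                       ('cat', OneHotEncoder(), cat_cols)])

lr_pipe = Pipeline(steps=[('prep', prep),
                          ('model', LogisticRegressionCV(cv=3, random_state=42,
                                                         max_iter=4000))])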
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [4.33079219 2.2537477 3.84267497 7.0162499 4.68141985 3.33273602
6.7529695 3.17223239 5.96025872 4.27506018]
mean value: 4.561814141273499
key: score_time
value: [0.01348472 0.01347852 0.01347613 0.01357293 0.01347971 0.01862359
0.01413321 0.01366806 0.01911831 0.01387167]
mean value: 0.014690685272216796
key: test_mcc
value: [0.40440279 0.52888168 0.47144837 0.38861146 0.3689725 0.468994
0.38003689 0.40883441 0.44563016 0.4290951 ]
mean value: 0.4294907356164722
key: train_mcc
value: [0.54379699 0.54566087 0.59860029 0.68248684 0.6308091 0.59461343
0.65326532 0.56533223 0.63723923 0.58263774]
mean value: 0.6034442032709686
key: test_fscore
value: [0.52380952 0.64233577 0.62820513 0.55782313 0.53521127 0.61111111
0.52941176 0.54814815 0.58571429 0.56296296]
mean value: 0.5724733087937734
key: train_fscore
value: [0.6 0.65024631 0.71206514 0.77038145 0.73126419 0.71029412
0.73642173 0.68006182 0.71617162 0.67114094]
mean value: 0.6978047310604498
key: test_precision
value: [0.6875 0.74576271 0.62820513 0.5942029 0.59375 0.66666667
0.63157895 0.66071429 0.67213115 0.67857143]
mean value: 0.6559083214482044
key: train_precision
value: [0.88186813 0.7734375 0.74573643 0.81616482 0.78536585 0.73853211
0.84277879 0.74702886 0.85601578 0.82135524]
mean value: 0.8008283518606302
key: test_recall
value: [0.42307692 0.56410256 0.62820513 0.52564103 0.48717949 0.56410256
0.4556962 0.46835443 0.51898734 0.48101266]
mean value: 0.5116358325219085
key: train_recall
value: [0.45467422 0.56090652 0.68130312 0.72946176 0.68413598 0.68413598
0.65390071 0.62411348 0.61560284 0.56737589]
mean value: 0.6255610471540796
key: test_accuracy
value: [0.77186312 0.81368821 0.77946768 0.75285171 0.74904943 0.78707224
0.75665399 0.76806084 0.77946768 0.7756654 ]
mean value: 0.7733840304182509
key: train_accuracy
value: [0.8191804 0.82002535 0.83565695 0.87029996 0.85002112 0.83354457
0.86058302 0.82509506 0.85466836 0.83438952]
mean value: 0.8403464300802703
key: test_roc_auc
value: [0.67099792 0.74151074 0.73572419 0.68714484 0.67331947 0.72259182
0.67078288 0.68254678 0.70514584 0.69159329]
mean value: 0.6981357776005547
key: train_roc_auc
value: [0.7143931 0.74553453 0.79128371 0.82981215 0.80233289 0.79059297
0.80107791 0.76723123 0.78583993 0.75751466]
mean value: 0.7785613082086863
key: test_jcc
value: [0.35483871 0.47311828 0.45794393 0.38679245 0.36538462 0.44
0.36 0.37755102 0.41414141 0.39175258]
mean value: 0.40215229945649267
key: train_jcc
value: [0.42857143 0.48175182 0.55287356 0.62652068 0.57637232 0.55074116
0.58280657 0.51522248 0.55784062 0.50505051]
mean value: 0.5377751154373915
MCC on Blind test: 0.29
Accuracy on Blind test: 0.74
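Each classifier block above prints the same keys (fit_time, score_time, and test_/train_ variants of mcc, fscore, precision, recall, accuracy, roc_auc, jcc). A hedged sketch of one way such per-fold output could be produced with cross_validate and a scorer dictionary; this is an assumption about the approach rather than the script's actual code, and mlp_pipe, X_train and y_train are placeholders:

# Hedged sketch -- mlp_pipe, X_train, y_train are placeholders.
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import make_scorer, matthews_corrcoef, f1_score, jaccard_score

scoring = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': make_scorer(f1_score),
           'accuracy': 'accuracy',
           'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
cv_out = cross_validate(mlp_pipe, X_train, y_train, cv=skf,
                        scoring=scoring, return_train_score=True)
for key, value in cv_out.items():
    print('key:', key)
    print('value:', value)
    print('mean value:', value.mean())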
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02673459 0.02333736 0.02253413 0.02257562 0.02241206 0.0224576
0.0231638 0.02241945 0.0227015 0.0222733 ]
mean value: 0.023060941696166994
key: score_time
value: [0.01310134 0.01308513 0.01315904 0.01297092 0.01308918 0.01300979
0.01306653 0.01313424 0.01304984 0.01315618]
mean value: 0.013082218170166016
key: test_mcc
value: [0.20397133 0.19210109 0.29269852 0.24018515 0.22314098 0.14258177
0.18611172 0.27917271 0.1516845 0.14951656]
mean value: 0.20611643173064403
key: train_mcc
value: [0.20424093 0.21090525 0.21102685 0.20650484 0.20096048 0.2109867
0.19723899 0.21812855 0.21757248 0.21267115]
mean value: 0.20902362364598925
key: test_fscore
value: [0.45121951 0.42857143 0.51219512 0.46835443 0.44736842 0.375
0.42580645 0.48684211 0.42774566 0.39473684]
mean value: 0.44178399778713573
key: train_fscore
value: [0.43919886 0.43909555 0.44380816 0.44661549 0.4356578 0.4532967
0.43851019 0.43043812 0.45986395 0.44897959]
mean value: 0.4435464413634301
key: test_precision
value: [0.43023256 0.43421053 0.48837209 0.4625 0.45945946 0.40909091
0.43421053 0.50684932 0.39361702 0.4109589 ]
mean value: 0.4429501312799416
key: train_precision
value: [0.44364162 0.45263158 0.44862518 0.44016506 0.44233577 0.44
0.43454039 0.46979866 0.44183007 0.44553073]
mean value: 0.4459099045970024
key: test_recall
value: [0.47435897 0.42307692 0.53846154 0.47435897 0.43589744 0.34615385
0.41772152 0.46835443 0.46835443 0.37974684]
mean value: 0.44264849074975665
key: train_recall
value: [0.43484419 0.42634561 0.43909348 0.45325779 0.42917847 0.4674221
0.44255319 0.39716312 0.47943262 0.45248227]
mean value: 0.44217728487332486
key: test_accuracy
value: [0.65779468 0.66539924 0.69581749 0.68060837 0.68060837 0.65779468
0.66159696 0.70342205 0.62357414 0.65019011]
mean value: 0.6676806083650191
key: train_accuracy
value: [0.66877905 0.67511618 0.67173638 0.66497676 0.66835657 0.66370934
0.66244191 0.6869455 0.66455429 0.66920152]
mean value: 0.6695817490494298
key: test_roc_auc
value: [0.60474705 0.59532225 0.65031185 0.62096327 0.60984061 0.56767152
0.59201293 0.63635113 0.57928591 0.57302559]
mean value: 0.6029532112973224
key: train_roc_auc
value: [0.60152806 0.60360026 0.6048568 0.60411234 0.59959827 0.60728119
0.5991346 0.60351537 0.61125662 0.60680672]
mean value: 0.6041690218638992
key: test_jcc
value: [0.29133858 0.27272727 0.3442623 0.30578512 0.28813559 0.23076923
0.2704918 0.32173913 0.27205882 0.24590164]
mean value: 0.28432094950300624
key: train_jcc
value: [0.28139322 0.28130841 0.28518859 0.28751123 0.27849265 0.29307282
0.28082808 0.27424094 0.29858657 0.28947368]
mean value: 0.2850096202737361
MCC on Blind test: 0.23
Accuracy on Blind test: 0.72
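MultinomialNB above runs with its default smoothing (alpha=1.0), and its non-negativity requirement is satisfied because MinMaxScaler maps every numeric feature into [0, 1]. A hedged sketch of tuning alpha inside the same kind of pipeline; the simplified preprocessing and the grid values are illustrative, not from the original script:

# Hedged sketch -- simplified preprocessing, illustrative alpha grid;
# X_train and y_train are placeholders.
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

nb_pipe = Pipeline(steps=[('prep', MinMaxScaler()),   # stand-in for the ColumnTransformer above
                          ('model', MultinomialNB())])

nb_grid = GridSearchCV(nb_pipe,
                       param_grid={'model__alpha': [0.01, 0.1, 0.5, 1.0]},
                       scoring='matthews_corrcoef', cv=10)
# nb_grid.fit(X_train, y_train); nb_grid.best_params_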
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02430153 0.02428341 0.02441669 0.0246284 0.0245142 0.02469468
0.02447581 0.02432632 0.02470875 0.0246079 ]
mean value: 0.024495768547058105
key: score_time
value: [0.01332068 0.01342702 0.01342511 0.01338768 0.01342034 0.01337624
0.01343846 0.01335359 0.013448 0.01345205]
mean value: 0.013404917716979981
key: test_mcc
value: [0.13092137 0.06420733 0.17064359 0.08831872 0.11740026 0.08823935
0.09819346 0.0992627 0.05707224 0.1828192 ]
mean value: 0.10970782205018796
key: train_mcc
value: [0.16478329 0.17613574 0.16102014 0.16465051 0.16855645 0.16626809
0.17283691 0.15283547 0.1572961 0.15131048]
mean value: 0.1635693192320974
key: test_fscore
value: [0.328125 0.2300885 0.37313433 0.25862069 0.30645161 0.288
0.30769231 0.28099174 0.28571429 0.31578947]
mean value: 0.29746079291198224
key: train_fscore
value: [0.33843384 0.33115061 0.33810376 0.32432432 0.3426009 0.35191638
0.35794961 0.31332083 0.34524847 0.32422587]
mean value: 0.3367274579671352
key: test_precision
value: [0.42 0.37142857 0.44642857 0.39473684 0.41304348 0.38297872
0.39215686 0.4047619 0.35185185 0.51428571]
mean value: 0.40916725202721
key: train_precision
value: [0.46419753 0.48760331 0.45873786 0.47411444 0.46699267 0.45701357
0.46188341 0.46260388 0.4479638 0.45292621]
mean value: 0.46340366775856634
key: test_recall
value: [0.26923077 0.16666667 0.32051282 0.19230769 0.24358974 0.23076923
0.25316456 0.21518987 0.24050633 0.2278481 ]
mean value: 0.23597857838364167
key: train_recall
value: [0.26628895 0.25070822 0.26770538 0.24645892 0.27053824 0.28611898
0.29219858 0.23687943 0.28085106 0.25248227]
mean value: 0.26502300444015836
key: test_accuracy
value: [0.6730038 0.66920152 0.68060837 0.6730038 0.6730038 0.66159696
0.65779468 0.66920152 0.63878327 0.70342205]
mean value: 0.6699619771863119
key: train_accuracy
value: [0.68948035 0.69792987 0.68736798 0.69370511 0.69032531 0.68567807
0.68779045 0.69074778 0.68272074 0.68652302]
mean value: 0.6892268694550063
key: test_roc_auc
value: [0.55623701 0.52387387 0.57647263 0.53399168 0.5488219 0.53700624
0.54234315 0.53966015 0.52514447 0.5677284 ]
mean value: 0.5451279495913507
key: train_roc_auc
value: [0.56782238 0.56936374 0.56672446 0.56513193 0.569646 0.57081385
0.57389712 0.5600763 0.56702 0.56156003]
mean value: 0.5672055812222606
key: test_jcc
value: [0.19626168 0.13 0.2293578 0.14851485 0.18095238 0.1682243
0.18181818 0.16346154 0.16666667 0.1875 ]
mean value: 0.17527573988574655
key: train_jcc
value: [0.20368364 0.19843049 0.20344456 0.19354839 0.20670996 0.21353066
0.21798942 0.18576196 0.20864067 0.19347826]
mean value: 0.20252180078060097
MCC on Blind test: 0.1
Accuracy on Blind test: 0.7
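BernoulliNB thresholds every feature at its binarize parameter (default 0.0) before fitting, so with MinMax-scaled inputs almost every non-minimum value is treated as 1. A hedged sketch of comparing the default against a mid-range cut-point; the 0.5 value is illustrative, not from the original run:

# Hedged sketch -- 0.5 is an illustrative threshold, not from the original run.
from sklearn.naive_bayes import BernoulliNB

bnb_default = BernoulliNB()              # binarize=0.0, as in the pipeline above
bnb_mid = BernoulliNB(binarize=0.5)      # alternative cut-point to compare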
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: 
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
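The UndefinedMetricWarning above fires whenever a fold predicts no positive samples at all, in which case precision is silently set to 0.0. A hedged sketch, not the original script's scorer setup, of pinning that behaviour explicitly via the zero_division argument:

# Hedged sketch -- illustrative scorers, not the script's actual scorer setup.
from sklearn.metrics import make_scorer, precision_score, f1_score

precision_scorer = make_scorer(precision_score, zero_division=0)
fscore_scorer = make_scorer(f1_score, zero_division=0)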
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.03711629 0.03351736 0.03191185 0.04453802 0.0460515 0.03751326
0.03187895 0.05301237 0.04071546 0.04045105]
mean value: 0.03967061042785645
key: score_time
value: [0.01279163 0.01288223 0.01292491 0.01289678 0.01986098 0.01710844
0.01290154 0.0134809 0.01801753 0.01301551]
mean value: 0.014588046073913574
key: test_mcc
value: [0.37245113 0.25183369 0.24941665 0.17716674 0.0389866 0.23531392
0.40288001 0. 0.25139696 0.39897031]
mean value: 0.23784160096185233
key: train_mcc
value: [0.41208345 0.28881893 0.24330463 0.22829327 0.11526657 0.2226917
0.37640363 0.1063854 0.3504306 0.39499079]
mean value: 0.27386689578956414
key: test_fscore
value: [0.46551724 0.5198556 0.5177305 0.17777778 0.025 0.14285714
0.60550459 0. 0.2970297 0.60377358]
mean value: 0.33550461291679223
key: train_fscore
value: [0.47731755 0.5372036 0.51784298 0.21601942 0.04155125 0.17280813
0.58870968 0.03616134 0.39493136 0.5995829 ]
mean value: 0.3582128203528704
key: test_precision
value: [0.71052632 0.36180905 0.35784314 0.66666667 0.5 1.
0.47482014 0. 0.68181818 0.48120301]
mean value: 0.5234686498159044
key: train_precision
value: [0.78571429 0.37758621 0.3595815 0.75423729 0.9375 0.83950617
0.45660672 0.92857143 0.77272727 0.47403133]
mean value: 0.6686062203972838
key: test_recall
value: [0.34615385 0.92307692 0.93589744 0.1025641 0.01282051 0.07692308
0.83544304 0. 0.18987342 0.81012658]
mean value: 0.4232878935410581
key: train_recall
value: [0.3427762 0.9305949 0.92492918 0.12606232 0.02124646 0.09631728
0.82836879 0.01843972 0.26524823 0.81560284]
mean value: 0.4369585920077149
key: test_accuracy
value: [0.76425856 0.49429658 0.48288973 0.71863118 0.70342205 0.72623574
0.6730038 0.69961977 0.73003802 0.68060837]
mean value: 0.6673003802281369
key: train_accuracy
value: [0.77608787 0.5217575 0.48626954 0.72708069 0.70764681 0.72496831
0.65525982 0.70722433 0.75792142 0.67553866]
mean value: 0.6739754964089566
key: test_roc_auc
value: [0.64334719 0.61829522 0.61389466 0.54047124 0.50370755 0.53846154
0.71935195 0.5 0.57591497 0.71756329]
mean value: 0.5971007622816924
key: train_roc_auc
value: [0.65152055 0.63928902 0.61237428 0.55430148 0.51032221 0.54424533
0.70509896 0.50891902 0.61607778 0.71586399]
mean value: 0.6058012628934482
key: test_jcc
value: [0.30337079 0.35121951 0.3492823 0.09756098 0.01265823 0.07692308
0.43421053 0. 0.1744186 0.43243243]
mean value: 0.22320764391430128
key: train_jcc
value: [0.3134715 0.36724427 0.3493847 0.12108844 0.02121641 0.0945758
0.41714286 0.0184136 0.24605263 0.42814594]
mean value: 0.2376736141659775
MCC on Blind test: 0.19
Accuracy on Blind test: 0.43
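Every classifier block ends with "MCC on Blind test" and "Accuracy on Blind test" lines. A hedged sketch of how such figures could be obtained by refitting the pipeline on the full training split and scoring a held-out blind set; pa_pipe, X_train, y_train, X_blind and y_blind are placeholders, and this is an assumption about the workflow rather than the script's actual code:

# Hedged sketch -- all data and pipeline names below are placeholders.
from sklearn.metrics import matthews_corrcoef, accuracy_score

pa_pipe.fit(X_train, y_train)
y_blind_pred = pa_pipe.predict(X_blind)
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_blind_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_blind_pred), 2))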
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.06890297 0.10364747 0.05830121 0.05737972 0.05682445 0.0590663
0.05896473 0.05848122 0.05933833 0.06016874]
mean value: 0.06410751342773438
key: score_time
value: [0.01613188 0.01425767 0.01664186 0.01608992 0.01729941 0.01448679
0.01447415 0.0143373 0.0143218 0.01420999]
mean value: 0.015225076675415039
key: test_mcc
value: [ 0.1150102 0.12222663 -0.06805024 0.07125986 0.07125986 0.0180625
0.07771725 0.0001649 0.09121784 0.05403499]
mean value: 0.05529037821096737
key: train_mcc
value: [0.10868273 0.11041579 0.11297102 0.11547619 0.11297102 0.11041579
0.11285702 0.11618332 0.1145308 0.11781548]
mean value: 0.11323191737640673
key: test_fscore
value: [0.46846847 0.46987952 0.44776119 0.46341463 0.46341463 0.45645646
0.46846847 0.45783133 0.46884273 0.4652568 ]
mean value: 0.4629794226642828
key: train_fscore
value: [0.46925889 0.469571 0.47003995 0.47050983 0.47003995 0.469571
0.46953047 0.47015672 0.46984339 0.47047047]
mean value: 0.46989916599694304
key: test_precision
value: [0.30588235 0.30708661 0.29182879 0.304 0.304 0.29803922
0.30708661 0.30039526 0.30620155 0.30555556]
mean value: 0.30300759536083754
key: train_precision
value: [0.30655667 0.30682312 0.30722367 0.30762527 0.30722367 0.30682312
0.30678851 0.30732345 0.30705575 0.30759162]
mean value: 0.3071034860232819
key: test_recall
value: [1. 1. 0.96153846 0.97435897 0.97435897 0.97435897
0.98734177 0.96202532 1. 0.97468354]
mean value: 0.9808666017526777
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.3269962 0.33079848 0.29657795 0.33079848 0.33079848 0.31178707
0.3269962 0.31558935 0.31939163 0.3269962 ]
mean value: 0.32167300380228137
key: train_accuracy
value: [0.32530629 0.32615125 0.32741867 0.3286861 0.32741867 0.32615125
0.3269962 0.3286861 0.32784115 0.32953105]
mean value: 0.32741867342627795
key: test_roc_auc
value: [0.52162162 0.52432432 0.48887734 0.51690922 0.51690922 0.5033957
0.51541002 0.5000344 0.51358696 0.51179829]
mean value: 0.5112867086319205
key: train_roc_auc
value: [0.5192655 0.51986755 0.52077062 0.52167369 0.52077062 0.51986755
0.52075812 0.52196149 0.52135981 0.52256318]
mean value: 0.5208858132089538
key: test_jcc
value: [0.30588235 0.30708661 0.28846154 0.3015873 0.3015873 0.29571984
0.30588235 0.296875 0.30620155 0.30314961]
mean value: 0.3012433462736509
key: train_jcc
value: [0.30655667 0.30682312 0.30722367 0.30762527 0.30722367 0.30682312
0.30678851 0.30732345 0.30705575 0.30759162]
mean value: 0.3071034860232819
MCC on Blind test: 0.09
Accuracy on Blind test: 0.35
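The "Variables are collinear" warnings earlier in this run come from the discriminant-analysis covariance estimates, and the QDA fold metrics above (recall near 1.0 with accuracy near 0.33) suggest the classifier labels almost everything positive. A hedged sketch of adding covariance shrinkage via reg_param, which QuadraticDiscriminantAnalysis supports; the 0.1 value is illustrative, not from the original run:

# Hedged sketch -- reg_param value is illustrative.
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

qda_reg = QuadraticDiscriminantAnalysis(reg_param=0.1)   # shrinks per-class covariance estimates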
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: 
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
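The FutureWarning above comes from the deprecated max_features='auto' alias; for a RandomForestClassifier the equivalent explicit setting is 'sqrt'. A hedged sketch, not the original script's code, using the same constructor arguments echoed for the "Random Forest2" variant further down:

# Hedged sketch -- spelling out the default that 'auto' used to alias.
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=1000, max_features='sqrt',
                            min_samples_leaf=5, n_jobs=10,
                            oob_score=True, random_state=42)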
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [7.40919328 7.43323874 7.38366652 7.39183664 7.30058074 7.47701049
7.43835711 7.43318129 7.3354075 7.38053226]
mean value: 7.398300457000732
key: score_time
value: [0.12630749 0.13165593 0.12785149 0.12744379 0.12720776 0.12901258
0.1277597 0.12734485 0.13595152 0.12760901]
mean value: 0.1288144111633301
key: test_mcc
value: [0.48488848 0.55552181 0.50892531 0.37636053 0.41186242 0.40671938
0.47885221 0.38148349 0.51013783 0.41518928]
mean value: 0.4529940737086856
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.60150376 0.64615385 0.62773723 0.51162791 0.54545455 0.53125
0.59701493 0.48333333 0.62222222 0.54545455]
mean value: 0.5711752310644238
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.72727273 0.80769231 0.72881356 0.64705882 0.66666667 0.68
0.72727273 0.70731707 0.75 0.67924528]
mean value: 0.7121339167945474
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.51282051 0.53846154 0.55128205 0.42307692 0.46153846 0.43589744
0.50632911 0.36708861 0.53164557 0.4556962 ]
mean value: 0.47838364167478087
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.79847909 0.82509506 0.80608365 0.76045627 0.77186312 0.77186312
0.79467681 0.76425856 0.80608365 0.77186312]
mean value: 0.7870722433460077
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.71586972 0.74220374 0.73239778 0.66288981 0.68212058 0.67470547
0.71240369 0.65093561 0.72777931 0.68165245]
mean value: 0.6982958161370381
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.43010753 0.47727273 0.45744681 0.34375 0.375 0.36170213
0.42553191 0.31868132 0.4516129 0.375 ]
mean value: 0.4016105327125402
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.31
Accuracy on Blind test: 0.75
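The key/value blocks above (fit_time, score_time and the paired test_/train_ metrics) have the shape of scikit-learn's cross_validate output with a multi-metric scoring dict, followed by a separate evaluation on the blind-test set. A minimal, self-contained sketch of how such numbers can be produced; the synthetic data and generic estimator are stand-ins, not this run's pipeline:

# Minimal sketch, not the project's own scoring code: 10-fold stratified CV with a
# multi-metric scoring dict, then MCC/accuracy on a held-out blind-test set.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef, accuracy_score, make_scorer

X, y = make_classification(n_samples=600, n_features=20, weights=[0.7], random_state=42)
X_tr, X_bts, y_tr, y_bts = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
scoring = {'mcc': make_scorer(matthews_corrcoef), 'fscore': 'f1', 'precision': 'precision',
           'recall': 'recall', 'accuracy': 'accuracy', 'roc_auc': 'roc_auc', 'jcc': 'jaccard'}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
cv = cross_validate(clf, X_tr, y_tr, cv=skf, scoring=scoring, return_train_score=True)
for k, v in cv.items():                      # keys: fit_time, score_time, test_*, train_*
    print('key:', k, '| mean value:', v.mean())

clf.fit(X_tr, y_tr)
print('MCC on Blind test:', round(matthews_corrcoef(y_bts, clf.predict(X_bts)), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_bts, clf.predict(X_bts)), 2))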
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [1.82287669 1.80299139 1.89801621 1.8409524 1.81344414 1.89064193
1.84345198 1.82935882 1.8130846 1.80522966]
mean value: 1.8360047817230225
key: score_time
value: [0.36049843 0.36330676 0.33833313 0.34528875 0.21095634 0.30681586
0.3696053 0.21399665 0.35363913 0.36342788]
mean value: 0.3225868225097656
key: test_mcc
value: [0.51633797 0.50076369 0.52465714 0.3803635 0.41847191 0.45655394
0.43385398 0.35673628 0.53039852 0.45209767]
mean value: 0.4570234591638388
key: train_mcc
value: [0.80308384 0.80410418 0.79797943 0.79781679 0.80658208 0.80002174
0.79471317 0.80580224 0.80799806 0.79794937]
mean value: 0.8016050900977321
key: test_fscore
value: [0.62686567 0.58536585 0.6259542 0.5 0.54263566 0.55284553
0.55384615 0.45762712 0.63703704 0.55555556]
mean value: 0.5637732776226437
key: train_fscore
value: [0.84430177 0.84522855 0.83964545 0.83990346 0.84896661 0.84151247
0.83656958 0.84707766 0.84867894 0.83912692]
mean value: 0.8431011408802457
key: test_precision
value: [0.75 0.8 0.77358491 0.67391304 0.68627451 0.75555556
0.70588235 0.69230769 0.76785714 0.74468085]
mean value: 0.7350056053667957
key: train_precision
value: [0.97407407 0.974122 0.97383178 0.97206704 0.9673913 0.97392924
0.97363465 0.97242647 0.97426471 0.97556391]
mean value: 0.973130516387697
key: test_recall
value: [0.53846154 0.46153846 0.52564103 0.3974359 0.44871795 0.43589744
0.4556962 0.34177215 0.5443038 0.44303797]
mean value: 0.45925024342745857
key: train_recall
value: [0.74504249 0.74645892 0.73796034 0.73937677 0.75637394 0.7407932
0.73333333 0.75035461 0.75177305 0.73617021]
mean value: 0.7437636871396138
key: test_accuracy
value: [0.80988593 0.80608365 0.81368821 0.76425856 0.7756654 0.79087452
0.77946768 0.75665399 0.81368821 0.78707224]
mean value: 0.7897338403041825
key: train_accuracy
value: [0.91803971 0.91846219 0.91592733 0.91592733 0.91972962 0.91677229
0.91465991 0.91930714 0.92015209 0.91592733]
mean value: 0.917490494296578
key: test_roc_auc
value: [0.73139293 0.70644491 0.73038808 0.65817741 0.68111573 0.68821899
0.68708723 0.63827738 0.73682581 0.68891029]
mean value: 0.6946838761203098
key: train_roc_auc
value: [0.86830692 0.86901513 0.86476584 0.86517303 0.87276855 0.86618227
0.86245487 0.87066467 0.87167473 0.86417416]
mean value: 0.8675180173911243
key: test_jcc
value: [0.45652174 0.4137931 0.45555556 0.33333333 0.37234043 0.38202247
0.38297872 0.2967033 0.4673913 0.38461538]
mean value: 0.39452553379803895
key: train_jcc
value: [0.73055556 0.73194444 0.72361111 0.72399445 0.73756906 0.72638889
0.71905424 0.73472222 0.73713491 0.72284123]
mean value: 0.7287816112371679
MCC on Blind test: 0.33
Accuracy on Blind test: 0.76
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.03667712 0.05546021 0.06430936 0.0387404 0.04058552 0.04140234
0.03888202 0.03683639 0.03822231 0.03893661]
mean value: 0.04300522804260254
key: score_time
value: [0.02017307 0.02778125 0.02514958 0.02578521 0.02422738 0.02570701
0.02638769 0.02651 0.02669811 0.02393436]
mean value: 0.025235366821289063
key: test_mcc
value: [0.5130914 0.54357424 0.5350134 0.40301004 0.43211514 0.42595455
0.42901186 0.39451236 0.47392309 0.38667607]
mean value: 0.45368821564897044
key: train_mcc
value: [0.50654196 0.50740811 0.52051062 0.51540147 0.52190151 0.52281702
0.51904775 0.52364165 0.51245072 0.52254233]
mean value: 0.5172263143693743
key: test_fscore
value: [0.64864865 0.66197183 0.65734266 0.54135338 0.57553957 0.53968254
0.57746479 0.53030303 0.60431655 0.54285714]
mean value: 0.5879480137118889
key: train_fscore
value: [0.62691378 0.63116057 0.64069952 0.63758921 0.64409449 0.64051241
0.63961814 0.64336775 0.6305578 0.64051241]
mean value: 0.6375026089294349
key: test_precision
value: [0.68571429 0.734375 0.72307692 0.65454545 0.6557377 0.70833333
0.65079365 0.66037736 0.7 0.62295082]
mean value: 0.6795904530544379
key: train_precision
value: [0.7271028 0.7192029 0.73007246 0.72432432 0.7251773 0.73664825
0.72826087 0.73104693 0.73308271 0.73529412]
mean value: 0.7290212671193563
key: test_recall
value: [0.61538462 0.6025641 0.6025641 0.46153846 0.51282051 0.43589744
0.51898734 0.44303797 0.53164557 0.48101266]
mean value: 0.5205452775073028
key: train_recall
value: [0.5509915 0.56232295 0.57082153 0.5694051 0.57932011 0.56657224
0.57021277 0.57446809 0.55319149 0.56737589]
mean value: 0.5664681654712395
key: test_accuracy
value: [0.80228137 0.81749049 0.81368821 0.76806084 0.7756654 0.77946768
0.77186312 0.76425856 0.79087452 0.75665399]
mean value: 0.7840304182509505
key: train_accuracy
value: [0.80439375 0.80397127 0.80904098 0.8069286 0.80904098 0.81030841
0.8086185 0.81030841 0.8069286 0.81030841]
mean value: 0.8079847908745247
key: test_roc_auc
value: [0.74823285 0.75533611 0.7526334 0.67941788 0.6996535 0.68011088
0.69971106 0.67260594 0.71690974 0.67800633]
mean value: 0.708261769188434
key: train_roc_auc
value: [0.73154632 0.73450283 0.74055827 0.73864596 0.74300142 0.74023976
0.73998003 0.74240853 0.73387613 0.74036664]
mean value: 0.7385125892244688
key: test_jcc
value: [0.48 0.49473684 0.48958333 0.37113402 0.4040404 0.36956522
0.40594059 0.36082474 0.43298969 0.37254902]
mean value: 0.4181363864145801
key: train_jcc
value: [0.45657277 0.46109175 0.47134503 0.46798603 0.47502904 0.47114252
0.47017544 0.47423888 0.46044864 0.47114252]
mean value: 0.4679172617206403
MCC on Blind test: 0.25
Accuracy on Blind test: 0.73
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.18392992 0.12751102 0.27314997 0.18074131 0.17915106 0.18539882
0.22811103 0.16719675 0.10775495 0.18047953]
mean value: 0.1813424348831177
key: score_time
value: [0.01298881 0.02016687 0.02849007 0.0251596 0.02123976 0.02758551
0.03386307 0.01344585 0.0206573 0.02007842]
mean value: 0.022367525100708007
key: test_mcc
value: [0.50477385 0.51986246 0.55228458 0.38254043 0.48018637 0.44750631
0.43725032 0.40614197 0.49168614 0.40016775]
mean value: 0.46224001702814216
key: train_mcc
value: [0.50791178 0.49557858 0.50732004 0.5153734 0.50524633 0.5202033
0.5189479 0.51761255 0.5015071 0.51588406]
mean value: 0.5105585053763914
key: test_fscore
value: [0.63380282 0.63768116 0.66666667 0.52631579 0.60869565 0.55555556
0.58156028 0.54135338 0.61313869 0.54411765]
mean value: 0.5908887640528316
key: train_fscore
value: [0.62356792 0.61688312 0.62681745 0.63336019 0.62770216 0.63533225
0.63235294 0.63795853 0.62052117 0.63080685]
mean value: 0.6285302586253506
key: test_precision
value: [0.703125 0.73333333 0.74603175 0.63636364 0.7 0.72916667
0.66129032 0.66666667 0.72413793 0.64912281]
mean value: 0.694923810969472
key: train_precision
value: [0.73837209 0.72243346 0.72932331 0.73457944 0.72191529 0.74242424
0.74566474 0.72859745 0.72848948 0.74137931]
mean value: 0.733317881238351
key: test_recall
value: [0.57692308 0.56410256 0.6025641 0.44871795 0.53846154 0.44871795
0.51898734 0.4556962 0.53164557 0.46835443]
mean value: 0.5154170723790977
key: train_recall
value: [0.53966006 0.53824363 0.54957507 0.55665722 0.55524079 0.55524079
0.54893617 0.56737589 0.54042553 0.54893617]
mean value: 0.5500291322604625
key: test_accuracy
value: [0.80228137 0.80988593 0.82129278 0.76045627 0.79467681 0.78707224
0.7756654 0.76806084 0.79847909 0.76425856]
mean value: 0.788212927756654
key: train_accuracy
value: [0.80566117 0.80059147 0.80481622 0.80777355 0.8035488 0.80988593
0.80988593 0.80819603 0.80312632 0.8086185 ]
mean value: 0.8062103929024081
key: test_roc_auc
value: [0.73711019 0.73880804 0.75803881 0.67030492 0.72058212 0.68922384
0.70242845 0.67893506 0.72234452 0.67982939]
mean value: 0.7097605338393727
key: train_roc_auc
value: [0.72919186 0.72517238 0.73144015 0.73558328 0.73216585 0.7366812
0.73475689 0.73886243 0.72749315 0.73385437]
mean value: 0.7325201573425192
key: test_jcc
value: [0.46391753 0.46808511 0.5 0.35714286 0.4375 0.38461538
0.41 0.37113402 0.44210526 0.37373737]
mean value: 0.4208237531428242
key: train_jcc
value: [0.4530321 0.44600939 0.45647059 0.4634434 0.45740957 0.46555819
0.46236559 0.46838407 0.4498229 0.46071429]
mean value: 0.45832100982280766
MCC on Blind test: 0.25
Accuracy on Blind test: 0.73
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.31986976 0.31140661 0.29563594 0.30532598 0.30202127 0.29620934
0.30111408 0.30566716 0.30478692 0.30286217]
mean value: 0.30448992252349855
key: score_time
value: [0.08396244 0.08340526 0.07117009 0.08412647 0.08466125 0.07585144
0.06858587 0.0831387 0.08528805 0.07590079]
mean value: 0.07960903644561768
key: test_mcc
value: [0.46901258 0.50269469 0.46905807 0.3077797 0.36207082 0.34401792
0.37687801 0.39900656 0.41740351 0.35073724]
mean value: 0.39986590922027376
key: train_mcc
value: [0.50974033 0.51813543 0.50894827 0.52818877 0.53789316 0.5070469
0.51387894 0.54021053 0.48568199 0.51417209]
mean value: 0.5163896421670848
key: test_fscore
value: [0.59854015 0.60465116 0.57142857 0.43333333 0.5112782 0.41818182
0.52238806 0.51968504 0.5203252 0.47154472]
mean value: 0.5171356244979302
key: train_fscore
value: [0.60314685 0.61028771 0.60606061 0.63087248 0.63443596 0.59717314
0.60526316 0.6391926 0.57805531 0.60869565]
mean value: 0.6113183473699516
key: test_precision
value: [0.69491525 0.76470588 0.75 0.61904762 0.61818182 0.71875
0.63636364 0.6875 0.72727273 0.65909091]
mean value: 0.6875827846546939
key: train_precision
value: [0.78767123 0.79365079 0.77951002 0.77366255 0.79069767 0.79342723
0.79310345 0.78512397 0.77884615 0.78651685]
mean value: 0.7862209927701852
key: test_recall
value: [0.52564103 0.5 0.46153846 0.33333333 0.43589744 0.29487179
0.44303797 0.41772152 0.40506329 0.36708861]
mean value: 0.4184193443687114
key: train_recall
value: [0.48866856 0.49575071 0.49575071 0.5325779 0.52974504 0.47875354
0.4893617 0.53900709 0.45957447 0.4964539 ]
mean value: 0.500564362204408
key: test_accuracy
value: [0.79087452 0.80608365 0.79467681 0.74144487 0.75285171 0.75665399
0.75665399 0.76806084 0.7756654 0.75285171]
mean value: 0.7695817490494297
key: train_accuracy
value: [0.80819603 0.81115336 0.80777355 0.81411069 0.81791297 0.80735108
0.80988593 0.81875792 0.80016899 0.80988593]
mean value: 0.8105196451204056
key: test_roc_auc
value: [0.71417186 0.71756757 0.6983368 0.62342342 0.66119196 0.62311157
0.66717116 0.66809989 0.66992295 0.64278343]
mean value: 0.6685780623136154
key: train_roc_auc
value: [0.71633909 0.72048222 0.71807403 0.73317637 0.7350712 0.7128867
0.71760504 0.73821594 0.70210974 0.71964693]
mean value: 0.7213607254091653
key: test_jcc
value: [0.42708333 0.43333333 0.4 0.27659574 0.34343434 0.26436782
0.35353535 0.35106383 0.35164835 0.30851064]
mean value: 0.35095727441426267
key: train_jcc
value: [0.43178974 0.4391468 0.43478261 0.46078431 0.46459627 0.4256927
0.43396226 0.4697157 0.40652447 0.4375 ]
mean value: 0.4404494857894855
MCC on Blind test: 0.25
Accuracy on Blind test: 0.73
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.05842829 0.09799361 0.08701038 0.09880447 0.10781527 0.08456612
0.09003544 0.1020596 0.10651445 0.07889628]
mean value: 0.09121239185333252
key: score_time
value: [0.01149511 0.01124096 0.0114572 0.01130176 0.01151252 0.01208425
0.01206446 0.01147723 0.0118351 0.01155615]
mean value: 0.011602473258972169
key: test_mcc
value: [0.40455383 0.48322802 0.44412904 0.41239173 0.46352173 0.31567502
0.43483932 0.29694893 0.21445316 0.46456695]
mean value: 0.3934307736654106
key: train_mcc
value: [0.39594179 0.49903863 0.4572869 0.51664968 0.53278563 0.45194341
0.48003129 0.48108277 0.25006567 0.40750221]
mean value: 0.44723279806069527
key: test_fscore
value: [0.60176991 0.62585034 0.53333333 0.56115108 0.63030303 0.37735849
0.61627907 0.43548387 0.1978022 0.63926941]
mean value: 0.5218600729909647
key: train_fscore
value: [0.59834065 0.63636364 0.54826958 0.64506173 0.67830759 0.51923077
0.64779874 0.58954584 0.19647355 0.60530265]
mean value: 0.5664694747155247
key: test_precision
value: [0.45945946 0.66666667 0.76190476 0.63934426 0.59770115 0.71428571
0.56989247 0.6 0.75 0.5 ]
mean value: 0.6259254487155251
key: train_precision
value: [0.4564408 0.68403909 0.76785714 0.70847458 0.6449553 0.80838323
0.5819209 0.74458874 0.87640449 0.4675425 ]
mean value: 0.6740606791696186
key: test_recall
value: [0.87179487 0.58974359 0.41025641 0.5 0.66666667 0.25641026
0.67088608 0.34177215 0.11392405 0.88607595]
mean value: 0.5307530022719896
key: train_recall
value: [0.86827195 0.59490085 0.42634561 0.59206799 0.71529745 0.38243626
0.73049645 0.48794326 0.1106383 0.85815603]
mean value: 0.576655415586764
key: test_accuracy
value: [0.65779468 0.79087452 0.78707224 0.76806084 0.76806084 0.74904943
0.74904943 0.7338403 0.72243346 0.69961977]
mean value: 0.7425855513307985
key: train_accuracy
value: [0.65230249 0.79721166 0.79045205 0.80566117 0.79763414 0.78876215
0.7634136 0.79763414 0.7304605 0.66666667]
mean value: 0.7590198563582594
key: test_roc_auc
value: [0.71968122 0.73270963 0.67810118 0.69054054 0.73873874 0.60658351
0.72674739 0.62197303 0.54880985 0.75282058]
mean value: 0.6816705669409301
key: train_roc_auc
value: [0.71438884 0.73905187 0.68577967 0.74425796 0.7739642 0.67195263
0.75393655 0.70847223 0.55200988 0.72179763]
mean value: 0.7065611455249907
key: test_jcc
value: [0.43037975 0.45544554 0.36363636 0.39 0.46017699 0.23255814
0.44537815 0.27835052 0.1097561 0.46979866]
mean value: 0.36354802077151066
key: train_jcc
value: [0.42688022 0.46666667 0.37766625 0.476082 0.51321138 0.35064935
0.47906977 0.41798299 0.10893855 0.43400287]
mean value: 0.4051150048691243
MCC on Blind test: 0.16
Accuracy on Blind test: 0.72
Running classifier: 24
Model_name: XGBoost
Model func: /home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
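The SettingWithCopyWarning above is raised where MultClfs_logo_skf.py assigns scoresDF_CV['source_data'] = 'CV' (and the BT equivalent) on what pandas suspects is a slice of another DataFrame. A minimal sketch of the two usual remedies, using a toy frame rather than the project's scoresDF objects:

# Minimal sketch of the fix pandas recommends for the warning above; 'df' is a
# stand-in DataFrame, not the project's scoresDF_CV / scoresDF_BT objects.
import pandas as pd

df = pd.DataFrame({'MCC': [0.45, 0.31], 'Accuracy': [0.79, 0.74], 'split': ['cv', 'bt']})

# Option 1: take an explicit copy before adding the label column.
scoresDF_CV = df[df['split'] == 'cv'].copy()
scoresDF_CV['source_data'] = 'CV'            # no warning when assigning on a copy

# Option 2: assign through .loc on the original frame, as the warning suggests.
df.loc[df['split'] == 'cv', 'source_data'] = 'CV'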
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.43344212 0.33086538 0.35976291 0.41917515 0.679847 0.35258722
0.33991528 0.48529649 0.33304524 0.36988974]
mean value: 0.41038265228271487
key: score_time
value: [0.01188159 0.01170635 0.01204205 0.0130949 0.01224422 0.01207519
0.01245761 0.01181293 0.01168418 0.01269698]
mean value: 0.012169599533081055
key: test_mcc
value: [0.44979328 0.54357424 0.56720934 0.43022174 0.44189513 0.49600301
0.46528152 0.26876789 0.56252444 0.46283882]
mean value: 0.4688109405717009
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.59722222 0.66197183 0.68493151 0.59060403 0.59310345 0.6119403
0.6 0.45714286 0.68456376 0.5942029 ]
mean value: 0.6075682847769258
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.65151515 0.734375 0.73529412 0.61971831 0.64179104 0.73214286
0.68852459 0.52459016 0.72857143 0.69491525]
mean value: 0.6751437917847418
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.55128205 0.6025641 0.64102564 0.56410256 0.55128205 0.52564103
0.53164557 0.40506329 0.64556962 0.51898734]
mean value: 0.5537163258682246
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.77946768 0.81749049 0.82509506 0.76806084 0.7756654 0.80228137
0.78707224 0.71102662 0.82129278 0.78707224]
mean value: 0.7874524714828898
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.71347886 0.75533611 0.77186417 0.70907831 0.71077616 0.72227997
0.71419235 0.6237273 0.77115438 0.71058063]
mean value: 0.7202468233336423
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.42574257 0.49473684 0.52083333 0.41904762 0.42156863 0.44086022
0.42857143 0.2962963 0.52040816 0.42268041]
mean value: 0.43907455117525507
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.3
Accuracy on Blind test: 0.74
Extracting tts_split_name: logo_skf_BT_rpob
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_rpob
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
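A minimal sketch of the rowbind logged above, where the CV and BT score frames share the same 8 columns and are stacked with pd.concat; the toy frames stand in for the real (24, 8) dataframes:

# Minimal sketch of the CV/BT rowbind; one-row toy frames stand in for the real
# 24 x 8 score dataframes.
import pandas as pd

common_cols = ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
cv_df = pd.DataFrame([[0.71, 0.79, 'CV', 0.57, 0.48, 0.45, 0.70, 0.40]], columns=common_cols)
bt_df = pd.DataFrame([[0.68, 0.75, 'BT', 0.52, 0.44, 0.31, 0.66, 0.36]], columns=common_cols)

combined_df_wf = pd.concat([cv_df, bt_df], axis=0, ignore_index=True)  # rowbind
print(combined_df_wf.shape)   # (nrows_cv + nrows_bt, 8)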
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: pnca
Total genes: 6
Training on: 4
Training on genes: ['katg', 'gid', 'rpob', 'embb']
Omitted genes: ['alr', 'pnca']
Blind test gene: pnca
/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_pnca.csv
Training data dim: (3338, 171)
Training Target dim: (3338,)
Checked training df does NOT have Target var
TEST data dim: (424, 171)
TEST Target dim: (424,)
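The lines above show how the blind-test split is formed: rows for the held-out gene (pnca here) become the blind-test set and the remaining genes are used for training. A minimal sketch of such a gene-level hold-out; the CSV path is the one logged above, while the 'gene_name' and 'target' column names are illustrative assumptions rather than confirmed details of the loader:

# Minimal sketch of a gene-level hold-out split like the one logged above.
# The CSV path is taken from the log; 'gene_name' and 'target' column names are
# illustrative assumptions, not confirmed details of the project's loader.
import pandas as pd

data_csv = '/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_pnca.csv'
bts_gene = 'pnca'

df = pd.read_csv(data_csv)
train_df = df[df['gene_name'] != bts_gene]          # genes used for training
bts_df   = df[df['gene_name'] == bts_gene]          # blind-test gene only

X_train, y_train = train_df.drop(columns=['target']), train_df['target']
X_bts,   y_bts   = bts_df.drop(columns=['target']),   bts_df['target']
print('Training data dim:', X_train.shape)          # logged run: (3338, 171)
print('TEST data dim:', X_bts.shape)                # logged run: (424, 171)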
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
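Each entry above is a (name, estimator) pair wrapped in the same preprocessing ColumnTransformer before fitting, which is why every 'Running model pipeline:' block repeats the MinMaxScaler/OneHotEncoder preamble. A minimal sketch of that loop; the column lists and the shortened model list are placeholders, not this run's feature sets:

# Minimal sketch of iterating a (name, estimator) list through one shared
# preprocessing pipeline; columns and the trimmed model list are placeholders.
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

models = [('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
          ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))]

numeric_cols = ['snap2_score', 'electro_sm']          # placeholder column names
categorical_cols = ['ss_class', 'active_site']        # placeholder column names

prep = ColumnTransformer(remainder='passthrough',
                         transformers=[('num', MinMaxScaler(), numeric_cols),
                                       ('cat', OneHotEncoder(), categorical_cols)])

for name, clf in models:
    pipe = Pipeline(steps=[('prep', prep), ('model', clf)])
    print('Running classifier:', name)
    # pipe.fit(X_train, y_train) would follow here with the real data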
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.80269051 0.74609613 0.75645041 0.76536942 0.72242522 0.79545379
0.82826662 0.73645329 0.74790859 0.71811295]
mean value: 0.7619226932525635
key: score_time
value: [0.02003074 0.01866269 0.01946402 0.01870513 0.01920581 0.01915264
0.0196712 0.02054477 0.0186789 0.01865315]
mean value: 0.019276905059814452
key: test_mcc
value: [0.35912545 0.35478491 0.30740545 0.35170341 0.27889961 0.41472585
0.37647182 0.35836582 0.39190724 0.38294764]
mean value: 0.3576337185860672
key: train_mcc
value: [0.47182028 0.48162108 0.4731965 0.47605619 0.47602501 0.4556967
0.46787394 0.46446475 0.46113939 0.45435317]
mean value: 0.4682247001831293
key: test_fscore
value: [0.50322581 0.49673203 0.46835443 0.47552448 0.41958042 0.53947368
0.50666667 0.47142857 0.5 0.47761194]
mean value: 0.4858598020684317
key: train_fscore
value: [0.57700977 0.58452292 0.57054742 0.57771039 0.57918552 0.56188307
0.57315234 0.56684492 0.56288344 0.56502242]
mean value: 0.571876218865844
key: test_precision
value: [0.56521739 0.56716418 0.51388889 0.59649123 0.52631579 0.63076923
0.6031746 0.62264151 0.64814815 0.66666667]
mean value: 0.5940477635034185
key: train_precision
value: [0.69314079 0.70216606 0.71153846 0.70295203 0.69945355 0.68391867
0.69090909 0.69606004 0.69639469 0.67379679]
mean value: 0.6950330178091753
key: test_recall
value: [0.45348837 0.44186047 0.43023256 0.39534884 0.34883721 0.47126437
0.43678161 0.37931034 0.40697674 0.37209302]
mean value: 0.4136193531141406
key: train_recall
value: [0.49420849 0.5006435 0.47619048 0.49034749 0.49420849 0.47680412
0.48969072 0.47809278 0.47232947 0.48648649]
mean value: 0.4859002043280395
key: test_accuracy
value: [0.76946108 0.76946108 0.74850299 0.7754491 0.75149701 0.79041916
0.77844311 0.77844311 0.78978979 0.78978979]
mean value: 0.7741256226286166
key: train_accuracy
value: [0.81258322 0.81591212 0.81458056 0.81458056 0.81424767 0.80792277
0.81158455 0.81125166 0.81031614 0.8063228 ]
mean value: 0.8119302050953692
key: test_roc_auc
value: [0.66626032 0.66246249 0.64455176 0.65130345 0.61998312 0.68704919
0.66778352 0.64916934 0.66502683 0.65365785]
mean value: 0.6567247869136068
key: train_roc_auc
value: [0.70893631 0.7132764 0.70441764 0.70902646 0.71005889 0.70002684
0.70669455 0.70269092 0.70025809 0.70217502]
mean value: 0.705756112778004
key: test_jcc
value: [0.3362069 0.33043478 0.30578512 0.31192661 0.26548673 0.36936937
0.33928571 0.30841121 0.33333333 0.31372549]
mean value: 0.32139652564334326
key: train_jcc
value: [0.40549102 0.41295117 0.399137 0.40618337 0.40764331 0.3907075
0.40169133 0.39552239 0.39167556 0.39375 ]
mean value: 0.4004752651708558
MCC on Blind test: 0.23
Accuracy on Blind test: 0.6
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.37676358 0.28589511 0.40569425 0.40436363 0.39459324 0.40798235
0.43025231 0.41989827 0.31972909 0.42993808]
mean value: 0.38751099109649656
key: score_time
value: [0.04439116 0.02395129 0.04384208 0.04165554 0.03482842 0.04738593
0.04919624 0.04067159 0.04398918 0.04068208]
mean value: 0.041059350967407225
key: test_mcc
value: [0.3405809 0.25764668 0.2857789 0.35919065 0.27110573 0.41530727
0.34021442 0.39017 0.41647676 0.37760593]
mean value: 0.3454077253112331
key: train_mcc
value: [0.95128604 0.95645586 0.95045191 0.94246909 0.95035242 0.94510621
0.95734168 0.95650159 0.95035786 0.9521301 ]
mean value: 0.9512452756729063
key: test_fscore
value: [0.46478873 0.42580645 0.40298507 0.46715328 0.39705882 0.53061224
0.46896552 0.48529412 0.52777778 0.46153846]
mean value: 0.4631980485937716
key: train_fscore
value: [0.96276596 0.9669749 0.96202532 0.956 0.96217651 0.95791583
0.96752816 0.96679947 0.96217651 0.96350365]
mean value: 0.9627866308507851
key: test_precision
value: [0.58928571 0.47826087 0.5625 0.62745098 0.54 0.65
0.5862069 0.67346939 0.65517241 0.68181818]
mean value: 0.60441644441612
key: train_precision
value: [0.99587345 0.99321574 0.99723757 0.99170124 0.99315068 0.99445215
0.99590723 0.99726027 0.99315068 0.99452055]
mean value: 0.9946469578035273
key: test_recall
value: [0.38372093 0.38372093 0.31395349 0.37209302 0.31395349 0.44827586
0.3908046 0.37931034 0.44186047 0.34883721]
mean value: 0.3776530339481422
key: train_recall
value: [0.93178893 0.94208494 0.92921493 0.92277992 0.93307593 0.92396907
0.94072165 0.93814433 0.93307593 0.93436293]
mean value: 0.9329218577929919
key: test_accuracy
value: [0.77245509 0.73353293 0.76047904 0.78143713 0.75449102 0.79341317
0.76946108 0.79041916 0.7957958 0.78978979]
mean value: 0.7741274208340077
key: train_accuracy
value: [0.98135819 0.98335553 0.9810253 0.97802929 0.9810253 0.97902796
0.98368842 0.98335553 0.98103161 0.98169717]
mean value: 0.9813594298007537
key: test_roc_auc
value: [0.6454895 0.61927982 0.61463803 0.64774006 0.61060578 0.68162781
0.6468193 0.65726651 0.6804444 0.64607852]
mean value: 0.6449989735497534
key: train_roc_auc
value: [0.96522091 0.96991988 0.96415843 0.96004286 0.96541538 0.96108687
0.96968758 0.96862333 0.96541588 0.9662838 ]
mean value: 0.9655854928613502
key: test_jcc
value: [0.30275229 0.2704918 0.25233645 0.3047619 0.24770642 0.36111111
0.30630631 0.32038835 0.35849057 0.3 ]
mean value: 0.3024345205204771
key: train_jcc
value: [0.92820513 0.93606138 0.92682927 0.91570881 0.92710997 0.91923077
0.93709884 0.93573265 0.92710997 0.92957746]
mean value: 0.9282664265188691
MCC on Blind test: 0.22
Accuracy on Blind test: 0.63
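The 'Some inputs do not have OOB scores' UserWarnings and the associated true_divide RuntimeWarnings earlier in this section come from BaggingClassifier's out-of-bag estimate: with too few base estimators, some training rows never fall outside a bootstrap sample, exactly as the warning says. A minimal sketch of the usual remedy, raising n_estimators when oob_score=True; synthetic data and illustrative settings, not this run's configuration:

# Minimal sketch of addressing the OOB warnings seen above: give BaggingClassifier
# enough base estimators that every row is left out of at least some bootstrap
# samples. Synthetic data; the settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

bag = BaggingClassifier(n_estimators=100,      # the default of 10 often triggers the warning
                        oob_score=True, n_jobs=10, random_state=42)
bag.fit(X, y)
print('OOB accuracy estimate:', round(bag.oob_score_, 3))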
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.22756743 0.20912504 0.18766642 0.2193892 0.21309328 0.20957923
0.21334052 0.18643594 0.19661188 0.20933342]
mean value: 0.20721423625946045
key: score_time
value: [0.01053953 0.01002789 0.01032925 0.01013303 0.01050353 0.01024294
0.00989223 0.00989199 0.01006913 0.01004767]
mean value: 0.010167717933654785
key: test_mcc
value: [0.25214126 0.1875607 0.17448855 0.26865618 0.20737731 0.21093882
0.26569184 0.24897589 0.23185199 0.23544936]
mean value: 0.22831318943542905
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.45652174 0.41081081 0.40425532 0.45238095 0.4137931 0.42622951
0.4640884 0.44571429 0.43023256 0.42424242]
mean value: 0.4328269099002431
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.42857143 0.38383838 0.37254902 0.46341463 0.40909091 0.40625
0.44680851 0.44318182 0.43023256 0.44303797]
mean value: 0.4226975236898102
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.48837209 0.44186047 0.44186047 0.44186047 0.41860465 0.44827586
0.48275862 0.44827586 0.43023256 0.40697674]
mean value: 0.44490777866880515
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7005988 0.67365269 0.66467066 0.7245509 0.69461078 0.68562874
0.70958084 0.70958084 0.70570571 0.71471471]
mean value: 0.6983294671917426
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.63128282 0.59794636 0.59189797 0.63222056 0.60446362 0.60875332
0.63611615 0.62494765 0.615926 0.61441955]
mean value: 0.6157973985416639
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.29577465 0.2585034 0.25333333 0.29230769 0.26086957 0.27083333
0.30215827 0.28676471 0.27407407 0.26923077]
mean value: 0.276384979600811
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.19
Accuracy on Blind test: 0.63
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02538586 0.02608204 0.02541828 0.02411795 0.02470517 0.02313828
0.0252254 0.02509904 0.02517581 0.02437878]
mean value: 0.024872660636901855
key: score_time
value: [0.01116467 0.01163435 0.01115298 0.01131296 0.0109365 0.01055002
0.01117682 0.01098108 0.0112257 0.01096392]
mean value: 0.01110990047454834
key: test_mcc
value: [0.18414024 0.07174771 0.07773881 0.17036737 0.09422596 0.13605019
0.24790647 0.19343899 0.17720904 0.09949809]
mean value: 0.14523228831757945
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.40223464 0.31213873 0.30952381 0.37037037 0.33142857 0.36363636
0.43786982 0.3902439 0.38323353 0.3190184 ]
mean value: 0.36196981429206615
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.38709677 0.31034483 0.31707317 0.39473684 0.3258427 0.35955056
0.45121951 0.41558442 0.39506173 0.33766234]
mean value: 0.3694172866880629
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.41860465 0.31395349 0.30232558 0.34883721 0.3372093 0.36781609
0.42528736 0.36781609 0.37209302 0.30232558]
mean value: 0.35562683774391873
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.67964072 0.64371257 0.65269461 0.69461078 0.6497006 0.66467066
0.71556886 0.7005988 0.69069069 0.66666667]
mean value: 0.6758554962147777
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.59438297 0.536009 0.53825956 0.58167667 0.54763691 0.56852343
0.62155056 0.59281493 0.58685623 0.54792392]
mean value: 0.5715634188719594
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.25174825 0.18493151 0.18309859 0.22727273 0.19863014 0.22222222
0.28030303 0.24242424 0.23703704 0.18978102]
mean value: 0.22174487682902333
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.07
Accuracy on Blind test: 0.53
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.50100851 0.50458908 0.50750947 0.5027554 0.50709867 0.50387287
0.52958059 0.52616668 0.51518559 0.50339651]
mean value: 0.5101163387298584
key: score_time
value: [0.02524161 0.02666163 0.02485824 0.02521682 0.02515721 0.0255096
0.02455211 0.02585554 0.02548218 0.02513337]
mean value: 0.025366830825805663
key: test_mcc
value: [0.31893637 0.28559185 0.25497035 0.27360117 0.25901802 0.36185945
0.35160904 0.37898496 0.3096015 0.34154512]
mean value: 0.3135717837477247
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.4 0.42253521 0.352 0.33898305 0.3442623 0.47887324
0.45588235 0.44444444 0.3968254 0.37931034]
mean value: 0.4013116335672254
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.64102564 0.53571429 0.56410256 0.625 0.58333333 0.61818182
0.63265306 0.71794872 0.625 0.73333333]
mean value: 0.6276292754864184
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.29069767 0.34883721 0.25581395 0.23255814 0.24418605 0.3908046
0.35632184 0.32183908 0.29069767 0.25581395]
mean value: 0.298757016840417
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7754491 0.75449102 0.75748503 0.76646707 0.76047904 0.77844311
0.77844311 0.79041916 0.77177177 0.78378378]
mean value: 0.7717232202262142
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.61712303 0.62199925 0.59363278 0.59208552 0.59185109 0.65289218
0.64172367 0.63865233 0.61498446 0.61171264]
mean value: 0.617665696614018
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.25 0.26785714 0.21359223 0.20408163 0.20792079 0.31481481
0.2952381 0.28571429 0.24752475 0.23404255]
mean value: 0.2520786302033054
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.28
Accuracy on Blind test: 0.6
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.49398565 3.51620936 3.5381515 3.47313666 3.48146534 3.53978062
3.50320053 3.50463057 3.48509049 3.46694589]
mean value: 3.5002596616744994
key: score_time
value: [0.01140976 0.01066232 0.01158452 0.01024532 0.01099324 0.01069593
0.01074028 0.01069403 0.01054358 0.0107758 ]
mean value: 0.010834479331970214
key: test_mcc
value: [0.3807871 0.45309119 0.32111713 0.4353377 0.34787497 0.45382495
0.41845803 0.46406718 0.37170564 0.45865187]
mean value: 0.4104915748944464
key: train_mcc
value: [0.60155871 0.61977357 0.62788169 0.60650006 0.62513858 0.60714967
0.61735637 0.61322412 0.61267665 0.61077596]
mean value: 0.6142035382502113
key: test_fscore
value: [0.5 0.55555556 0.4379562 0.52554745 0.45588235 0.54929577
0.53691275 0.55944056 0.46616541 0.52307692]
mean value: 0.5109832980508825
key: train_fscore
value: [0.66666667 0.68015564 0.68735454 0.6677116 0.67777778 0.66978923
0.67909868 0.67191188 0.670347 0.6749226 ]
mean value: 0.6745735614063586
key: test_precision
value: [0.62068966 0.68965517 0.58823529 0.70588235 0.62 0.70909091
0.64516129 0.71428571 0.65957447 0.77272727]
mean value: 0.6725302129156613
key: train_precision
value: [0.84117647 0.86023622 0.86523438 0.85370741 0.88405797 0.84950495
0.85518591 0.86262626 0.86558045 0.84660194]
mean value: 0.8583911964819316
key: test_recall
value: [0.41860465 0.46511628 0.34883721 0.41860465 0.36046512 0.44827586
0.45977011 0.45977011 0.36046512 0.39534884]
mean value: 0.4135257952419139
key: train_recall
value: [0.55212355 0.56241956 0.57014157 0.54826255 0.54954955 0.55283505
0.56314433 0.55025773 0.54697555 0.56113256]
mean value: 0.5556842004006952
key: test_accuracy
value: [0.78443114 0.80838323 0.76946108 0.80538922 0.77844311 0.80838323
0.79341317 0.81137725 0.78678679 0.81381381]
mean value: 0.7959882037726349
key: train_accuracy
value: [0.85719041 0.86318242 0.86584554 0.85885486 0.86484687 0.85918775
0.86251664 0.86118509 0.8608985 0.86023295]
mean value: 0.8613941034804398
key: test_roc_auc
value: [0.66494749 0.69626782 0.63207989 0.67906039 0.64192611 0.69174927
0.68535064 0.69749639 0.64784389 0.6774315 ]
mean value: 0.6714153398306997
key: train_roc_auc
value: [0.75787588 0.76526905 0.76957909 0.75774151 0.76220181 0.75936187
0.76496534 0.75986854 0.75867628 0.76283738]
mean value: 0.7618376755571367
key: test_jcc
value: [0.33333333 0.38461538 0.28037383 0.35643564 0.2952381 0.37864078
0.36697248 0.38834951 0.30392157 0.35416667]
mean value: 0.3442047292147344
key: train_jcc
value: [0.5 0.51533019 0.52364066 0.50117647 0.51260504 0.50352113
0.51411765 0.50592417 0.50415184 0.50934579]
mean value: 0.5089812940722257
MCC on Blind test: 0.38
Accuracy on Blind test: 0.71
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.020751 0.02083588 0.02083611 0.0214746 0.02262139 0.02305222
0.02280498 0.02320433 0.02191854 0.0239296 ]
mean value: 0.022142863273620604
key: score_time
value: [0.01026201 0.01020145 0.01035452 0.01051569 0.01060915 0.01136088
0.01134086 0.01121569 0.01098585 0.0111146 ]
mean value: 0.010796070098876953
key: test_mcc
value: [0.17611843 0.20718729 0.23922017 0.2466313 0.2213728 0.17955059
0.28994156 0.27999901 0.26828606 0.21897197]
mean value: 0.2327279183013004
key: train_mcc
value: [0.23077977 0.24343359 0.24301522 0.24483631 0.2498486 0.23703351
0.23728025 0.23569499 0.23842235 0.24955868]
mean value: 0.24099032871320855
key: test_fscore
value: [0.43801653 0.45643154 0.47457627 0.47863248 0.46017699 0.43668122
0.50655022 0.5 0.49372385 0.46086957]
mean value: 0.4705658660802502
key: train_fscore
value: [0.47042254 0.4766939 0.47646494 0.47805344 0.48114558 0.47348485
0.47338403 0.47166186 0.47401049 0.48091603]
mean value: 0.47562376539841944
key: test_precision
value: [0.33974359 0.35483871 0.37333333 0.37837838 0.37142857 0.35211268
0.4084507 0.40425532 0.38562092 0.36805556]
mean value: 0.3736217752580154
key: train_precision
value: [0.37028825 0.3803681 0.38007663 0.37983321 0.38239757 0.3742515
0.375 0.37595712 0.37651515 0.38210766]
mean value: 0.3776795180720297
key: test_recall
value: [0.61627907 0.63953488 0.65116279 0.65116279 0.60465116 0.57471264
0.66666667 0.65517241 0.68604651 0.61627907]
mean value: 0.6361668003207699
key: train_recall
value: [0.64478764 0.63835264 0.63835264 0.64478764 0.64864865 0.6443299
0.64175258 0.63273196 0.63963964 0.64864865]
mean value: 0.6422031936207194
key: test_accuracy
value: [0.59281437 0.60778443 0.62874251 0.63473054 0.63473054 0.61377246
0.66167665 0.65868263 0.63663664 0.62762763]
mean value: 0.6297198396000792
key: train_accuracy
value: [0.62450067 0.63748336 0.63715047 0.63581891 0.63814913 0.6298269
0.63115846 0.63382157 0.63294509 0.63793677]
mean value: 0.6338791317621983
key: test_roc_auc
value: [0.60047824 0.61815454 0.63606527 0.64009752 0.62490623 0.6011215
0.66329285 0.65754572 0.65273986 0.62392901]
mean value: 0.6318330736617562
key: train_roc_auc
value: [0.63110509 0.63776635 0.63754183 0.63873868 0.64156725 0.63455274
0.63461058 0.63346652 0.63512503 0.64142486]
mean value: 0.6365898936955753
key: test_jcc
value: [0.28042328 0.29569892 0.31111111 0.31460674 0.29885057 0.27932961
0.33918129 0.33333333 0.32777778 0.29943503]
mean value: 0.3079747667399205
key: train_jcc
value: [0.30755064 0.31293375 0.31273644 0.31410658 0.3167819 0.3101737
0.31008717 0.30861094 0.310625 0.31658291]
mean value: 0.31201890451058895
MCC on Blind test: 0.16
Accuracy on Blind test: 0.57
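A hedged sketch of how the "MCC on Blind test" / "Accuracy on Blind test" lines could be produced for any of the fitted pipelines; report_blind_test, X_blind and y_blind are hypothetical names standing in for the held-out blind-test data referred to in this log:

# Hypothetical sketch: blind-test MCC and accuracy rounded to two decimals,
# matching the format of the lines above. Names here are placeholders.
from sklearn.metrics import accuracy_score, matthews_corrcoef

def report_blind_test(fitted_model, X_blind, y_blind):
    """Print blind-test MCC and accuracy for an already-fitted pipeline."""
    y_pred = fitted_model.predict(X_blind)
    print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
    print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))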
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.72187185 3.67817736 3.66631103 3.78374124 3.74256516 3.82478404
3.81376934 3.81057 4.06612968 4.19329572]
mean value: 3.8301215410232543
key: score_time
value: [0.10507536 0.10031843 0.10401344 0.09814143 0.10043049 0.13867617
0.10229993 0.12898898 0.10558844 0.1044693 ]
mean value: 0.10880019664764404
key: test_mcc
value: [0.15776839 0.11027014 0.14728007 0.16986396 0.16269835 0.2696399
0.220037 0.23711041 0.15616931 0.29855833]
mean value: 0.19293958615441217
key: train_mcc
value: [0.529112 0.53321743 0.52618963 0.51557166 0.52030622 0.51892879
0.51261139 0.52558463 0.52182758 0.52182758]
mean value: 0.5225176904659781
key: test_fscore
value: [0.20560748 0.18348624 0.2037037 0.17821782 0.19230769 0.29090909
0.26785714 0.24528302 0.17647059 0.28571429]
mean value: 0.22295570595449363
key: train_fscore
value: [0.52935694 0.53959484 0.52434457 0.51775701 0.51830986 0.51509434
0.50991501 0.52749301 0.52287582 0.52287582]
mean value: 0.5227617222426912
key: test_precision
value: [0.52380952 0.43478261 0.5 0.6 0.55555556 0.69565217
0.6 0.68421053 0.5625 0.78947368]
mean value: 0.5945984072500091
key: train_precision
value: [0.95945946 0.94822006 0.96219931 0.94539249 0.95833333 0.96126761
0.9540636 0.95286195 0.95238095 0.95238095]
mean value: 0.9546559729198009
key: test_recall
value: [0.12790698 0.11627907 0.12790698 0.10465116 0.11627907 0.18390805
0.17241379 0.14942529 0.10465116 0.1744186 ]
mean value: 0.13778401496925957
key: train_recall
value: [0.36550837 0.37709138 0.36036036 0.35649936 0.35521236 0.35180412
0.34793814 0.36469072 0.36036036 0.36036036]
mean value: 0.35998255250832567
key: test_accuracy
value: [0.74550898 0.73353293 0.74251497 0.75149701 0.74850299 0.76646707
0.75449102 0.76047904 0.74774775 0.77477477]
mean value: 0.7525516534498571
key: train_accuracy
value: [0.83189081 0.83355526 0.83089214 0.82822903 0.8292277 0.82889481
0.82723036 0.83122503 0.82995008 0.82995008]
mean value: 0.8301045306202933
key: test_roc_auc
value: [0.5437922 0.53192986 0.54177607 0.54022881 0.5420105 0.57778398
0.56596398 0.56256689 0.53815554 0.57911214]
mean value: 0.5523319970366736
key: train_roc_auc
value: [0.68005998 0.68495341 0.67771049 0.6746574 0.67491197 0.67343348
0.67105166 0.67920353 0.67703835 0.67703835]
mean value: 0.6770058614131267
key: test_jcc
value: [0.11458333 0.1010101 0.11340206 0.09782609 0.10638298 0.17021277
0.15463918 0.13978495 0.09677419 0.16666667]
mean value: 0.1261282309545822
key: train_jcc
value: [0.3599493 0.36948298 0.35532995 0.34930643 0.34980989 0.34688691
0.34220532 0.35822785 0.3539823 0.3539823 ]
mean value: 0.35391632307895976
MCC on Blind test: 0.23
Accuracy on Blind test: 0.52
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02256298 0.0175662 0.01870489 0.01778078 0.02008438 0.0201242
0.01781631 0.01947951 0.01941681 0.0201931 ]
mean value: 0.01937291622161865
key: score_time
value: [0.04281116 0.03672481 0.02884245 0.02895713 0.02915668 0.02990818
0.02901888 0.02962327 0.02614784 0.0298295 ]
mean value: 0.03110198974609375
key: test_mcc
value: [0.12485265 0.2088331 0.07363921 0.09645358 0.2288878 0.22113287
0.1907448 0.22767493 0.14915358 0.1260079 ]
mean value: 0.16473804357313399
key: train_mcc
value: [0.43516093 0.42873496 0.43420311 0.41813312 0.40812227 0.4323308
0.42278529 0.43729541 0.41901475 0.42186249]
mean value: 0.4257643117197021
key: test_fscore
value: [0.28985507 0.33587786 0.24817518 0.2556391 0.35114504 0.36879433
0.33576642 0.37142857 0.30656934 0.24590164]
mean value: 0.31091525568905654
key: train_fscore
value: [0.52088452 0.51597052 0.52045827 0.51056911 0.49710983 0.51845775
0.50867052 0.52408163 0.50819672 0.50944947]
mean value: 0.5133848326626257
key: test_precision
value: [0.38461538 0.48888889 0.33333333 0.36170213 0.51111111 0.48148148
0.46 0.49056604 0.41176471 0.41666667]
mean value: 0.4340129737374642
key: train_precision
value: [0.71621622 0.70945946 0.71460674 0.69315673 0.69354839 0.71331828
0.70804598 0.71492205 0.69977427 0.70454545]
mean value: 0.7067593568582107
key: test_recall
value: [0.23255814 0.25581395 0.19767442 0.19767442 0.26744186 0.29885057
0.26436782 0.29885057 0.24418605 0.1744186 ]
mean value: 0.24318364073777063
key: train_recall
value: [0.40926641 0.40540541 0.40926641 0.4041184 0.38738739 0.40721649
0.39690722 0.41365979 0.3989704 0.3989704 ]
mean value: 0.40311683185394526
key: test_accuracy
value: [0.70658683 0.73952096 0.69161677 0.70359281 0.74550898 0.73353293
0.72754491 0.73652695 0.71471471 0.72372372]
mean value: 0.7222869576162989
key: train_accuracy
value: [0.80525965 0.80326232 0.80492676 0.79960053 0.79727031 0.80459387
0.80193076 0.80592543 0.80033278 0.80133111]
mean value: 0.8024433533990176
key: test_roc_auc
value: [0.55176294 0.58153601 0.53028882 0.53835334 0.58936609 0.59274513
0.57752804 0.59476942 0.56136428 0.54469918]
mean value: 0.5662413240909696
key: train_roc_auc
value: [0.67634403 0.67373997 0.67611951 0.6708513 0.66383289 0.67510735
0.66995271 0.67810458 0.6696378 0.67031105]
mean value: 0.6724001199527248
key: test_jcc
value: [0.16949153 0.20183486 0.14166667 0.14655172 0.21296296 0.22608696
0.20175439 0.22807018 0.18103448 0.14018692]
mean value: 0.18496406581483296
key: train_jcc
value: [0.35215947 0.34768212 0.35176991 0.34279476 0.33076923 0.34994463
0.34108527 0.3550885 0.34065934 0.34178611]
mean value: 0.34537393343581185
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
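The warning above suggests two remedies: increase max_iter or scale the data reaching the solver (the ColumnTransformer scales only the numeric columns; passthrough columns stay unscaled). A minimal sketch of both options for the LogisticRegression model used later in this log; the max_iter value and the StandardScaler choice are assumptions, not the script's settings:

# Hypothetical sketch applying the advice in the ConvergenceWarning above.
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Option 1: give the lbfgs solver more iterations.
clf_more_iter = LogisticRegression(random_state=42, max_iter=5000)

# Option 2: scale every feature that reaches the solver.
clf_scaled = make_pipeline(StandardScaler(), LogisticRegression(random_state=42))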
MCC on Blind test: 0.19
Accuracy on Blind test: 0.55
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.10464048 0.11896873 0.11886811 0.11711931 0.11699486 0.1187551
0.11685467 0.12559366 0.12773681 0.11569357]
mean value: 0.1181225299835205
key: score_time
value: [0.01339555 0.01343822 0.01362228 0.0135417 0.01347685 0.01352429
0.01339507 0.01327801 0.01349616 0.01351953]
mean value: 0.013468766212463379
key: test_mcc
value: [0.34402661 0.35478491 0.26726217 0.36656526 0.39556996 0.4482635
0.4433557 0.36936845 0.32470177 0.39190724]
mean value: 0.37058055626140896
key: train_mcc
value: [0.4554314 0.44851444 0.46306182 0.44799278 0.44565518 0.44793742
0.44879776 0.43113577 0.45516017 0.45116724]
mean value: 0.4494853980451551
key: test_fscore
value: [0.48684211 0.49673203 0.4084507 0.49315068 0.50704225 0.55172414
0.55405405 0.4822695 0.44604317 0.5 ]
mean value: 0.4926308635083748
key: train_fscore
value: [0.55572755 0.55230769 0.56598017 0.5475819 0.55034589 0.5532567
0.54871395 0.53613054 0.55666924 0.55193798]
mean value: 0.5518651619336193
key: test_precision
value: [0.56060606 0.56716418 0.51785714 0.6 0.64285714 0.68965517
0.67213115 0.62962963 0.58490566 0.64814815]
mean value: 0.6112954283534735
key: train_precision
value: [0.69708738 0.68642447 0.69475655 0.6950495 0.68320611 0.68241966
0.69428008 0.67514677 0.69423077 0.69395712]
mean value: 0.6896558412864507
key: test_recall
value: [0.43023256 0.44186047 0.3372093 0.41860465 0.41860465 0.45977011
0.47126437 0.3908046 0.36046512 0.40697674]
mean value: 0.4135792568831863
key: train_recall
value: [0.46203346 0.46203346 0.47747748 0.45173745 0.46074646 0.46520619
0.45360825 0.44458763 0.46460746 0.45817246]
mean value: 0.46002102986639065
key: test_accuracy
value: [0.76646707 0.76946108 0.74850299 0.77844311 0.79041916 0.80538922
0.80239521 0.78143713 0.76876877 0.78978979]
mean value: 0.7801073528618439
key: train_accuracy
value: [0.80892144 0.80625832 0.81058589 0.8069241 0.80525965 0.80592543
0.80725699 0.80126498 0.80865225 0.80765391]
mean value: 0.8068702960666976
key: test_roc_auc
value: [0.65664854 0.66246249 0.61416917 0.66091523 0.66897974 0.69344781
0.69514635 0.65491647 0.63569815 0.66502683]
mean value: 0.6607410780955046
key: train_roc_auc
value: [0.69599203 0.6941959 0.70214242 0.69129306 0.69310336 0.69490112
0.69201956 0.68504067 0.69662151 0.69385284]
mean value: 0.6939162470973728
key: test_jcc
value: [0.32173913 0.33043478 0.25663717 0.32727273 0.33962264 0.38095238
0.38317757 0.31775701 0.28703704 0.33333333]
mean value: 0.3277963780729236
key: train_jcc
value: [0.38478028 0.38150903 0.39468085 0.37701396 0.37963945 0.38241525
0.37808808 0.36624204 0.38568376 0.38115632]
mean value: 0.3811209022117284
MCC on Blind test: 0.31
Accuracy on Blind test: 0.67
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.07578683 0.06815028 0.07243729 0.07182741 0.07275677 0.07255435
0.07732677 0.07013202 0.07905698 0.0797956 ]
mean value: 0.07398242950439453
key: score_time
value: [0.01607132 0.0143013 0.01509905 0.01514554 0.01529074 0.01569319
0.01524568 0.01699042 0.0152235 0.01522875]
mean value: 0.015428948402404784
key: test_mcc
value: [0.36990436 0.34463506 0.31087572 0.35148736 0.36706728 0.45382495
0.41981949 0.34025026 0.28251645 0.39141237]
mean value: 0.36317932882975695
key: train_mcc
value: [0.42825347 0.43420379 0.43276322 0.4193277 0.44894459 0.41573199
0.42657287 0.41424729 0.44573659 0.42216991]
mean value: 0.42879514028154764
key: test_fscore
value: [0.48951049 0.47222222 0.43971631 0.46376812 0.47058824 0.54929577
0.52777778 0.44444444 0.40875912 0.48120301]
mean value: 0.4747285503502094
key: train_fscore
value: [0.52623211 0.53025478 0.528332 0.51177904 0.53963171 0.51364366
0.52054795 0.51166533 0.53685897 0.51302932]
mean value: 0.5231974862584894
key: test_precision
value: [0.61403509 0.5862069 0.56363636 0.61538462 0.64 0.70909091
0.66666667 0.625 0.54901961 0.68085106]
mean value: 0.6249891210722501
key: train_precision
value: [0.68814969 0.69519833 0.69537815 0.6938326 0.71398305 0.68085106
0.69462366 0.68094218 0.71125265 0.69844789]
mean value: 0.6952659270626055
key: test_recall
value: [0.40697674 0.39534884 0.36046512 0.37209302 0.37209302 0.44827586
0.43678161 0.34482759 0.3255814 0.37209302]
mean value: 0.38345362202619626
key: train_recall
value: [0.42599743 0.42857143 0.42599743 0.40540541 0.43371943 0.41237113
0.41623711 0.40979381 0.43114543 0.40540541]
mean value: 0.4194644018097627
key: test_accuracy
value: [0.78143713 0.77245509 0.76347305 0.77844311 0.78443114 0.80838323
0.79640719 0.7754491 0.75675676 0.79279279]
mean value: 0.7810028591465717
key: train_accuracy
value: [0.80159787 0.80359521 0.80326232 0.79993342 0.80858855 0.79826897
0.80193076 0.79793609 0.80765391 0.80099834]
mean value: 0.8023765428679674
key: test_roc_auc
value: [0.65913353 0.64928732 0.63184546 0.64572393 0.64975619 0.69174927
0.67992927 0.63597655 0.61623199 0.65568214]
mean value: 0.6515315648331321
key: train_roc_auc
value: [0.67932112 0.68150619 0.68044371 0.6714948 0.68654988 0.67252309
0.67625141 0.67145885 0.68505207 0.67218206]
mean value: 0.6776783171588849
key: test_jcc
value: [0.32407407 0.30909091 0.28181818 0.30188679 0.30769231 0.37864078
0.35849057 0.28571429 0.25688073 0.31683168]
mean value: 0.31211203106926244
key: train_jcc
value: [0.3570658 0.36078007 0.35900217 0.34388646 0.36951754 0.34557235
0.35185185 0.34378378 0.36692223 0.34501643]
mean value: 0.3543398698205496
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
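The UndefinedMetricWarning above is raised when a fold contains no predicted positive samples, so precision is undefined and silently set to 0.0. A small sketch of the zero_division parameter the warning points to; wrapping it in make_scorer is an assumed way to control this during cross-validation, not necessarily what the script does:

# Hypothetical sketch of the `zero_division` control mentioned in the warning.
from sklearn.metrics import make_scorer, precision_score

y_true = [1, 0, 1, 0]
y_pred = [0, 0, 0, 0]          # no positive predictions in this fold
print(precision_score(y_true, y_pred, zero_division=0))  # 0.0, no warning raised

# The same parameter can be baked into a scorer used with cross_validate.
precision_no_warn = make_scorer(precision_score, zero_division=0)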
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
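A minimal sketch of the two remedies the ConvergenceWarning suggests, assuming a generic feature matrix and labels (X_train and y_train are placeholders, and max_iter=3000 is purely illustrative, not a value used in this log):

# Hedged sketch: standardise features and raise the lbfgs iteration budget,
# as the ConvergenceWarning recommends.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

logreg_pipe = Pipeline(steps=[
    ('scale', StandardScaler()),                 # zero-mean / unit-variance inputs help lbfgs converge
    ('model', LogisticRegression(max_iter=3000,  # default is 100; 3000 is illustrative only
                                 random_state=42)),
])
# logreg_pipe.fit(X_train, y_train)              # X_train / y_train are placeholders, not from this log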
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
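The UndefinedMetricWarning above appears when a fold contains no predicted positive samples, so precision is forced to 0.0; a small sketch of controlling that behaviour with the zero_division argument (toy labels only, not taken from this log):

from sklearn.metrics import precision_score

y_true = [1, 0, 1, 0]   # toy labels, not from this log
y_pred = [0, 0, 0, 0]   # no positive predictions -> precision is undefined
# zero_division=0 sets the metric to 0.0 without emitting UndefinedMetricWarning;
# zero_division=1 would instead credit such folds with precision 1.0.
print(precision_score(y_true, y_pred, zero_division=0))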
MCC on Blind test: 0.34
Accuracy on Blind test: 0.68
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
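For reference, the preprocessing in the pipeline repr above min-max scales the numeric columns and one-hot encodes the six categorical ones; a condensed sketch of how such a pipeline is assembled (the numeric column list is abbreviated here, and the variable names are hypothetical):

from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegressionCV

numeric_cols = ['KOLA920101', 'MIYS930101', 'snap2_score']   # abbreviated; the log uses 165 columns
categorical_cols = ['electrostatics_change', 'water_change', 'aa_prop_change',
                    'active_site', 'polarity_change', 'ss_class']

prep = ColumnTransformer(
    transformers=[('num', MinMaxScaler(), numeric_cols),
                  ('cat', OneHotEncoder(), categorical_cols)],
    remainder='passthrough')                                  # any remaining columns are passed through unchanged

pipe = Pipeline(steps=[('prep', prep),
                       ('model', LogisticRegressionCV(cv=3, random_state=42))])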
key: fit_time
value: [1.06736302 1.21625853 0.90434527 0.98431587 0.90768695 0.92346883
1.11748171 0.90980577 1.09542418 0.9335928 ]
mean value: 1.005974292755127
key: score_time
value: [0.01410007 0.01383758 0.01386881 0.01383924 0.01378107 0.01396728
0.01386619 0.01352906 0.01394057 0.01382422]
mean value: 0.013855409622192384
key: test_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_mcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_fscore
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_precision
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_recall
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: test_accuracy
value: [0.74251497 0.74251497 0.74251497 0.74251497 0.74251497 0.73952096
0.73952096 0.73952096 0.74174174 0.74174174]
mean value: 0.7414621208034382
key: train_accuracy
value: [0.74134487 0.74134487 0.74134487 0.74134487 0.74134487 0.74167776
0.74167776 0.74167776 0.74143095 0.74143095]
mean value: 0.7414619553296659
key: test_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: train_roc_auc
value: [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
mean value: 0.5
key: test_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
key: train_jcc
value: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
mean value: 0.0
MCC on Blind test: 0.0
Accuracy on Blind test: 0.41
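The key/value blocks follow the dictionary layout returned by sklearn.model_selection.cross_validate with return_train_score=True: fit_time, score_time, and test_/train_ entries per named scorer, one value per split. A hedged sketch of producing the same layout, with the scorer names inferred from the keys above and a toy dataset and estimator standing in for the real pipeline:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import (make_scorer, matthews_corrcoef, f1_score,
                             precision_score, recall_score, accuracy_score,
                             roc_auc_score, jaccard_score)

# Toy, imbalanced data standing in for the real feature matrix (not from this log).
X, y = make_classification(n_samples=400, n_features=20, weights=[0.74], random_state=42)

scoring = {'mcc':       make_scorer(matthews_corrcoef),
           'fscore':    make_scorer(f1_score),
           'precision': make_scorer(precision_score, zero_division=0),
           'recall':    make_scorer(recall_score),
           'accuracy':  make_scorer(accuracy_score),
           'roc_auc':   make_scorer(roc_auc_score),
           'jcc':       make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)   # 10 splits, matching the 10 values per key
scores = cross_validate(LogisticRegression(max_iter=1000), X, y,
                        cv=skf, scoring=scoring, return_train_score=True)
print(scores['test_mcc'].mean())   # analogous to the "mean value" lines above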
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [2.2790122 2.77812529 4.44488597 3.27720022 1.43865538 1.75207615
1.93726707 3.10793948 3.16267252 3.22343826]
mean value: 2.7401272535324095
key: score_time
value: [0.0149641 0.0144937 0.01403475 0.01402283 0.01397443 0.01485848
0.01834154 0.01413536 0.01422024 0.01452804]
mean value: 0.014757347106933594
key: test_mcc
value: [0.29482868 0.31371491 0.31974484 0.32565357 0.29397566 0.42495679
0.42018813 0.38517239 0.32221609 0.43564636]
mean value: 0.35360974248612526
key: train_mcc
value: [0.38471683 0.42705844 0.48283596 0.43420311 0.3567245 0.39955495
0.37684521 0.42475006 0.3953492 0.45053683]
mean value: 0.4132575092541134
key: test_fscore
value: [0.42028986 0.43478261 0.45517241 0.41860465 0.34482759 0.5248227
0.51094891 0.46969697 0.38333333 0.58139535]
mean value: 0.4543874366943369
key: train_fscore
value: [0.48956661 0.50668896 0.57910906 0.52045827 0.42019838 0.50040225
0.43026436 0.51507742 0.46086192 0.59914582]
mean value: 0.5021773053922451
key: test_precision
value: [0.55769231 0.57692308 0.55932203 0.62790698 0.66666667 0.68518519
0.7 0.68888889 0.67647059 0.58139535]
mean value: 0.632045107307112
key: train_precision
value: [0.65031983 0.72315036 0.71809524 0.71460674 0.70180723 0.66595289
0.73520249 0.70066519 0.72777778 0.56960557]
mean value: 0.6907183313700919
key: test_recall
value: [0.3372093 0.34883721 0.38372093 0.31395349 0.23255814 0.42528736
0.40229885 0.35632184 0.26744186 0.58139535]
mean value: 0.3649024325046779
key: train_recall
value: [0.39253539 0.38996139 0.48519949 0.40926641 0.2998713 0.4007732
0.30412371 0.40721649 0.33719434 0.63191763]
mean value: 0.40580593480078014
key: test_accuracy
value: [0.76047904 0.76646707 0.76347305 0.7754491 0.77245509 0.7994012
0.7994012 0.79041916 0.77777778 0.78378378]
mean value: 0.7789106471741203
key: train_accuracy
value: [0.78828229 0.80359521 0.81757656 0.80492676 0.78595206 0.79327563
0.79194407 0.80193076 0.79600666 0.78136439]
mean value: 0.7964854403778878
key: test_roc_auc
value: [0.62223368 0.63006377 0.63944111 0.62471868 0.59611778 0.67823072
0.67078505 0.64982084 0.61145372 0.71782318]
mean value: 0.6440688537082616
key: train_roc_auc
value: [0.65944686 0.66893669 0.70937118 0.67611951 0.62770844 0.66537762
0.63298645 0.67331202 0.64660435 0.73270029]
mean value: 0.6692563410850292
key: test_jcc
value: [0.26605505 0.27777778 0.29464286 0.26470588 0.20833333 0.35576923
0.34313725 0.30693069 0.2371134 0.40983607]
mean value: 0.2964301542854594
key: train_jcc
value: [0.32412327 0.33930571 0.40756757 0.35176991 0.26598174 0.33369099
0.27409988 0.34687157 0.29942857 0.42770035]
mean value: 0.3370539558976439
MCC on Blind test: 0.27
Accuracy on Blind test: 0.63
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02462029 0.02495074 0.02500033 0.02489638 0.02492046 0.02500868
0.02481937 0.02489829 0.02509499 0.02540708]
mean value: 0.02496166229248047
key: score_time
value: [0.01324105 0.01321006 0.01316714 0.01324892 0.01322603 0.01326823
0.01322246 0.01321411 0.01330686 0.01321483]
mean value: 0.013231968879699707
key: test_mcc
value: [0.06689167 0.12224273 0.17174784 0.12925757 0.10833126 0.16722219
0.17903549 0.12411626 0.18926435 0.22873409]
mean value: 0.14868434558076712
key: train_mcc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
value: [0.1840496 0.17085934 0.17561632 0.16541298 0.17626359 0.16836062
0.15277144 0.17374335 0.17661568 0.16061692]
mean value: 0.1704309843078074
key: test_fscore
value: [0.28205128 0.30344828 0.37575758 0.31292517 0.30463576 0.34666667
0.35761589 0.30344828 0.37419355 0.39735099]
mean value: 0.33580934436614085
key: train_fscore
value: [0.38199181 0.35268185 0.36531628 0.34234234 0.36258993 0.35857143
0.32876712 0.3539823 0.35640648 0.34437086]
mean value: 0.35470204059938004
key: test_precision
value: [0.31428571 0.37288136 0.39240506 0.37704918 0.35384615 0.41269841
0.421875 0.37931034 0.42028986 0.46153846]
mean value: 0.3906179541820004
key: train_precision
value: [0.40638607 0.4109589 0.40793651 0.41081081 0.41109299 0.40224359
0.40148699 0.4137931 0.41652324 0.40206186]
mean value: 0.4083294048448337
key: test_recall
value: [0.25581395 0.25581395 0.36046512 0.26744186 0.26744186 0.29885057
0.31034483 0.25287356 0.3372093 0.34883721]
mean value: 0.2955092221331196
key: train_recall
value: [0.36036036 0.30888031 0.33075933 0.29343629 0.32432432 0.32345361
0.27835052 0.30927835 0.31145431 0.3011583 ]
mean value: 0.31414557046000346
key: test_accuracy
value: [0.66467066 0.69760479 0.69161677 0.69760479 0.68562874 0.70658683
0.70958084 0.69760479 0.70870871 0.72672673]
mean value: 0.698633363902825
key: train_accuracy
value: [0.69840213 0.70672437 0.70272969 0.70838881 0.70505992 0.70106525
0.70639148 0.70838881 0.70915141 0.70349418]
mean value: 0.704979605672747
key: test_roc_auc
value: [0.53113278 0.5533102 0.58345836 0.55710803 0.54904351 0.5745265
0.58027363 0.55356229 0.58763299 0.6035684 ]
mean value: 0.5673616699669124
key: train_roc_auc
value: [0.58835261 0.57720621 0.58163472 0.57330099 0.58111142 0.57801944
0.56691314 0.57833756 0.57964996 0.5724822 ]
mean value: 0.5777008249804448
key: test_jcc
value: [0.1641791 0.17886179 0.23134328 0.18548387 0.1796875 0.20967742
0.21774194 0.17886179 0.23015873 0.24793388]
mean value: 0.20239293055581764
key: train_jcc
value: [0.23608769 0.21409456 0.22347826 0.20652174 0.22144112 0.21845083
0.19672131 0.21505376 0.21684588 0.208 ]
mean value: 0.21566951527820008
MCC on Blind test: 0.2
Accuracy on Blind test: 0.58
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02839565 0.02792978 0.02778769 0.02802801 0.02784538 0.02823591
0.02779031 0.02787352 0.02834773 0.02785993]
mean value: 0.02800939083099365
key: score_time
value: [0.01390409 0.01375246 0.01380038 0.01367593 0.01363206 0.01377678
0.01385999 0.01368332 0.01375628 0.01391125]
mean value: 0.013775253295898437
key: test_mcc
value: [0.16866215 0.08117412 0.13505628 0.1622417 0.08158273 0.09698404
0.13098725 0.14578827 0.13522487 0.24620577]
mean value: 0.1383907185321174
key: train_mcc
value: [0.15800921 0.17104907 0.15651911 0.16884287 0.17387815 0.1678983
0.16346678 0.16052526 0.1547298 0.15569669]
mean value: 0.1630615251884336
key: test_fscore
value: [0.31343284 0.2739726 0.27692308 0.31111111 0.25899281 0.27142857
0.31292517 0.2962963 0.28571429 0.37313433]
mean value: 0.29739310842155947
key: train_fscore
value: [0.33078101 0.32419355 0.30348259 0.31751227 0.32771084 0.32064777
0.31136738 0.30679934 0.30806846 0.29679595]
mean value: 0.31473591674338364
key: test_precision
value: [0.4375 0.33333333 0.40909091 0.42857143 0.33962264 0.35849057
0.38333333 0.41666667 0.40425532 0.52083333]
mean value: 0.403169753102511
key: train_precision
value: [0.40831758 0.43412527 0.42657343 0.43595506 0.43589744 0.43137255
0.43150685 0.43023256 0.42 0.43031785]
mean value: 0.4284298573854273
key: test_recall
value: [0.24418605 0.23255814 0.20930233 0.24418605 0.20930233 0.2183908
0.26436782 0.22988506 0.22093023 0.29069767]
mean value: 0.2363806468858594
key: train_recall
value: [0.27799228 0.25868726 0.23552124 0.24967825 0.26254826 0.25515464
0.2435567 0.23840206 0.24324324 0.22651223]
mean value: 0.24912961562446098
key: test_accuracy
value: [0.7245509 0.68263473 0.71856287 0.72155689 0.69161677 0.69461078
0.69760479 0.71556886 0.71471471 0.74774775]
mean value: 0.710916904928881
key: train_accuracy
value: [0.70905459 0.72103862 0.72037284 0.72237017 0.7213715 0.72070573
0.72170439 0.72170439 0.71747088 0.72246256]
mean value: 0.7198255681276877
key: test_roc_auc
value: [0.56765754 0.53563391 0.55223181 0.56564141 0.53408665 0.54036949
0.55728512 0.55826237 0.55378495 0.59879013]
mean value: 0.5563743380700682
key: train_roc_auc
value: [0.56872223 0.5705201 0.56252936 0.56848529 0.57200157 0.56900461
0.56589864 0.56421898 0.56304891 0.56096706]
mean value: 0.5665396751777358
key: test_jcc
value: [0.18584071 0.15873016 0.16071429 0.18421053 0.14876033 0.15702479
0.18548387 0.17391304 0.16666667 0.2293578 ]
mean value: 0.1750702181969585
key: train_jcc
value: [0.19816514 0.19345525 0.17888563 0.18871595 0.19596542 0.19093539
0.18439024 0.18119491 0.18208092 0.17425743]
mean value: 0.18680462767204709
MCC on Blind test: 0.18
Accuracy on Blind test: 0.52
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.0580802 0.05403805 0.04128051 0.04302478 0.06841564 0.0539484
0.04034019 0.06294227 0.06180072 0.05050826]
mean value: 0.05343790054321289
key: score_time
value: [0.01111007 0.01337314 0.01358175 0.01412654 0.01246071 0.01389861
0.01351237 0.01339674 0.01381898 0.01375675]
mean value: 0.013303565979003906
key: test_mcc
value: [0.39733574 0.21856376 0.11766121 0. 0.32565357 0.
0.30854417 0.275889 0. 0.35540957]
mean value: 0.1999057023425324
key: train_mcc
value: [0.40281168 0.25635055 0.18708294 0.04097181 0.38768338 0.04373573
0.25166209 0.29968361 0. 0.30150991]
mean value: 0.21714916901801332
key: test_fscore
value: [0.54545455 0.36879433 0.42857143 0. 0.41860465 0.
0.51505017 0.49681529 0. 0.40336134]
mean value: 0.31766517498159985
key: train_fscore
value: [0.54595336 0.38530612 0.44959374 0.00768246 0.47434819 0.00514139
0.48509091 0.50834879 0. 0.32636816]
mean value: 0.3187833124490242
key: test_precision
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
value: [0.56962025 0.47272727 0.28057554 0. 0.62790698 0.
0.36320755 0.34361233 0. 0.72727273]
mean value: 0.3384922651448662
key: train_precision
value: [0.58443465 0.52678571 0.29340141 0.75 0.68446602 1.
0.3378926 0.35695675 0. 0.71929825]
mean value: 0.525323540037564
key: test_recall
value: [0.52325581 0.30232558 0.90697674 0. 0.31395349 0.
0.88505747 0.89655172 0. 0.27906977]
mean value: 0.41071905907511363
key: train_recall
value: [0.51222651 0.3037323 0.96138996 0.003861 0.36293436 0.00257732
0.85953608 0.88273196 0. 0.21106821]
mean value: 0.41000577160370977
key: test_accuracy
value: [0.7754491 0.73353293 0.37724551 0.74251497 0.7754491 0.73952096
0.56586826 0.52694611 0.74174174 0.78678679]
mean value: 0.6765055474636312
key: train_accuracy
value: [0.77962716 0.74933422 0.39114514 0.74201065 0.79194407 0.74234354
0.5286285 0.55892144 0.74143095 0.77470882]
mean value: 0.6800094494085535
key: test_roc_auc
value: [0.69307952 0.59269505 0.55026257 0.5 0.62471868 0.5
0.66924938 0.64665643 0.5 0.62131626]
mean value: 0.5897977886468685
key: train_roc_auc
value: [0.69257486 0.60426849 0.57678838 0.50170598 0.65227993 0.50128866
0.63645565 0.664436 0.5 0.59117145]
mean value: 0.5920969408097997
key: test_jcc
value: [0.375 0.22608696 0.27272727 0. 0.26470588 0.
0.34684685 0.33050847 0. 0.25263158]
mean value: 0.20685070119724394
key: train_jcc
value: [0.3754717 0.23862487 0.28998447 0.00385604 0.3109151 0.00257732
0.32021123 0.34079602 0. 0.19500595]
mean value: 0.20774427082333577
MCC on Blind test: 0.27
Accuracy on Blind test: 0.66
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.09404349 0.06870151 0.06382132 0.06490541 0.06720114 0.10275173
0.10124159 0.07026529 0.06393051 0.06361794]
mean value: 0.07604799270629883
key: score_time
value: [0.01468492 0.01478004 0.01465011 0.01468658 0.01470542 0.02365756
0.02400112 0.01464987 0.01463866 0.01454282]
mean value: 0.016499710083007813
key: test_mcc
value: [0.12317212 0.00383982 0.06157074 0.04009662 0.07989651 0.09575692
0.07536308 0.05663375 0.09663183 0.12815419]
mean value: 0.0761115581332121
key: train_mcc
value: [0.1194382 0.12359198 0.12912099 0.12152995 0.12152995 0.12142446
0.11985988 0.12501128 0.12043997 0.11886315]
mean value: 0.12208098119455464
key: test_fscore
value: [0.42364532 0.40594059 0.41481481 0.41176471 0.41791045 0.42364532
0.42079208 0.41747573 0.42211055 0.42574257]
mean value: 0.4183842137296362
key: train_fscore
value: [0.42424242 0.425171 0.42645445 0.4247062 0.4247062 0.42427556
0.42392789 0.42508902 0.42435827 0.42401091]
mean value: 0.42469419301975064
key: test_precision
value: [0.26875 0.25786164 0.26332288 0.26086957 0.26582278 0.26959248
0.2681388 0.26461538 0.26923077 0.27044025]
mean value: 0.26586445524295216
key: train_precision
value: [0.26923077 0.26997915 0.271015 0.26960444 0.26960444 0.26925746
0.26897747 0.26991304 0.26932409 0.26904432]
mean value: 0.26959501870932223
key: test_recall
value: [1. 0.95348837 0.97674419 0.97674419 0.97674419 0.98850575
0.97701149 0.98850575 0.97674419 1. ]
mean value: 0.9814488104784816
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.2994012 0.28143713 0.29041916 0.28143713 0.2994012 0.2994012
0.2994012 0.28143713 0.30930931 0.3033033 ]
mean value: 0.29449479419539304
key: train_accuracy
value: [0.29793609 0.3005992 0.30426099 0.29926764 0.29926764 0.29893475
0.29793609 0.30126498 0.2985025 0.29750416]
mean value: 0.29954740324049356
key: test_roc_auc
value: [0.52822581 0.50093773 0.51458177 0.50853338 0.52063016 0.52259295
0.51887012 0.51044721 0.52683363 0.53036437]
mean value: 0.5182017135845458
key: train_roc_auc
value: [0.52649304 0.52828918 0.53075887 0.52739111 0.52739111 0.52737882
0.52670557 0.52894973 0.52692998 0.52625673]
mean value: 0.5276544130747259
key: test_jcc
value: [0.26875 0.25465839 0.26168224 0.25925926 0.26415094 0.26875
0.26645768 0.26380368 0.26751592 0.27044025]
mean value: 0.26454683671108925
key: train_jcc
value: [0.26923077 0.26997915 0.271015 0.26960444 0.26960444 0.26925746
0.26897747 0.26991304 0.26932409 0.26904432]
mean value: 0.26959501870932223
MCC on Blind test: 0.06
Accuracy on Blind test: 0.59
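The 'Variables are collinear' UserWarnings interleaved into the earlier output come from the discriminant-analysis fits and usually indicate linearly dependent feature columns; one hedged mitigation, not used in this log, is to regularise the class covariances via reg_param (0.1 below is purely illustrative):

from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# reg_param shrinks each per-class covariance estimate towards the identity,
# keeping the fit well-posed when columns are (nearly) collinear.
qda = QuadraticDiscriminantAnalysis(reg_param=0.1)   # illustrative value, not tuned here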
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
key: fit_time
value: [9.48924208 9.43958592 9.63640451 9.43083119 9.64917517 9.64940453
9.62794304 9.88782573 9.46423697 9.46671748]
mean value: 9.574136662483216
key: score_time
value: [0.14017081 0.1395781 0.13968158 0.14495087 0.14978433 0.14100695
0.14059258 0.15260839 0.1451242 0.15080047]
mean value: 0.1444298267364502
key: test_mcc
value: [0.35244469 0.30646828 0.3052084 0.3678787 0.32811187 0.37914266
0.40642734 0.35563376 0.38253454 0.36633241]
mean value: 0.35501826527312813
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.45112782 0.43165468 0.40310078 0.41666667 0.40322581 0.47407407
0.5106383 0.41935484 0.448 0.40677966]
mean value: 0.43646226157929846
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.63829787 0.56603774 0.60465116 0.73529412 0.65789474 0.66666667
0.66666667 0.7027027 0.71794872 0.75 ]
mean value: 0.6706160379454098
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.34883721 0.34883721 0.30232558 0.29069767 0.29069767 0.36781609
0.4137931 0.29885057 0.3255814 0.27906977]
mean value: 0.32665062817428503
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.78143713 0.76347305 0.76946108 0.79041916 0.77844311 0.78742515
0.79341317 0.78443114 0.79279279 0.78978979]
mean value: 0.7831085576594559
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
value: [0.64014441 0.62804764 0.6168886 0.62720368 0.61913916 0.65151938
0.6704593 0.62715808 0.64052349 0.62334055]
mean value: 0.6344424291452592
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.29126214 0.27522936 0.25242718 0.26315789 0.25252525 0.31067961
0.34285714 0.26530612 0.28865979 0.25531915]
mean value: 0.279742364515582
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.32
Accuracy on Blind test: 0.65
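The FutureWarning repeated during this run, and the explicit max_features='auto' in the next classifier's definition, can be avoided by spelling out the setting the warning itself names as equivalent; a hedged sketch:

from sklearn.ensemble import RandomForestClassifier

# For classifiers, 'auto' resolves to 'sqrt', so behaviour is unchanged while
# the deprecation warning from scikit-learn >= 1.1 goes away.
rf = RandomForestClassifier(n_estimators=1000, max_features='sqrt',
                            min_samples_leaf=5, n_jobs=10,
                            oob_score=True, random_state=42)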
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [2.06348848 2.04826379 2.01603174 2.10286808 2.02492619 2.06884789
2.0610857 2.03673744 2.01752186 2.12582326]
mean value: 2.056559443473816
key: score_time
value: [0.37546301 0.34496713 0.31308508 0.36820054 0.3853426 0.37358189
0.36383128 0.36255455 0.15650249 0.28507233]
mean value: 0.33286008834838865
key: test_mcc
value: [0.3375735 0.32565357 0.30654765 0.35340752 0.33280424 0.32166823
0.35160904 0.43259374 0.36633241 0.413438 ]
mean value: 0.3541627906622512
key: train_mcc
value: [0.76454974 0.77252831 0.76799346 0.76545792 0.77037834 0.77085639
0.77466934 0.76722864 0.75567105 0.76982717]
mean value: 0.767916036106461
key: test_fscore
value: [0.40650407 0.41860465 0.38709677 0.3826087 0.38655462 0.3902439
0.45588235 0.45378151 0.40677966 0.43103448]
mean value: 0.41190907196587156
key: train_fscore
value: [0.79414032 0.80275229 0.7981581 0.79506934 0.79907621 0.8
0.80337942 0.7962963 0.7844358 0.8 ]
mean value: 0.7973307774259772
key: test_precision
value: [0.67567568 0.62790698 0.63157895 0.75862069 0.6969697 0.66666667
0.63265306 0.84375 0.75 0.83333333]
mean value: 0.7117155047637642
key: train_precision
value: [0.99038462 0.98870056 0.98859316 0.99040307 0.99425287 0.99236641
0.99429658 0.99230769 0.99212598 0.98863636]
mean value: 0.9912067311186927
key: test_recall
value: [0.29069767 0.31395349 0.27906977 0.25581395 0.26744186 0.27586207
0.35632184 0.31034483 0.27906977 0.29069767]
mean value: 0.29192729216786956
key: train_recall
value: [0.66280566 0.67567568 0.66924067 0.66409266 0.66795367 0.67010309
0.67396907 0.66494845 0.64864865 0.67181467]
mean value: 0.6669252278788362
key: test_accuracy
value: [0.78143713 0.7754491 0.77245509 0.78742515 0.78143713 0.7754491
0.77844311 0.80538922 0.78978979 0.8018018 ]
mean value: 0.7849076621531711
key: train_accuracy
value: [0.91111851 0.91411451 0.91245007 0.9114514 0.91311585 0.91344874
0.91478029 0.91211718 0.9078203 0.91314476]
mean value: 0.9123561596185674
key: test_roc_auc
value: [0.62115529 0.62471868 0.61130908 0.61379407 0.61355964 0.61363954
0.64172367 0.64505096 0.62334055 0.63522738]
mean value: 0.6243518856033288
key: train_roc_auc
value: [0.83028024 0.83649073 0.83327323 0.83092375 0.83330328 0.83415388
0.83631129 0.83157656 0.82342666 0.83456084]
mean value: 0.8324300460340233
key: test_jcc
value: [0.25510204 0.26470588 0.24 0.23655914 0.23958333 0.24242424
0.2952381 0.29347826 0.25531915 0.27472527]
mean value: 0.2597135418480895
key: train_jcc
value: [0.65856777 0.67049808 0.66411239 0.65984655 0.66538462 0.66666667
0.67137356 0.66153846 0.6453265 0.66666667]
mean value: 0.6629981265370812
MCC on Blind test: 0.35
Accuracy on Blind test: 0.66
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.07881427 0.04033756 0.05301142 0.041677 0.04464722 0.04210424
0.04134798 0.04183865 0.04093719 0.040591 ]
mean value: 0.046530652046203616
key: score_time
value: [0.01318431 0.02713323 0.03056479 0.0271554 0.02807665 0.03538179
0.03617311 0.03478241 0.02740431 0.02778244]
mean value: 0.028763842582702637
key: test_mcc
value: [0.35577288 0.3478005 0.28050818 0.32916131 0.38062211 0.50666918
0.39017 0.38084356 0.25603702 0.39141237]
mean value: 0.36189971032145635
key: train_mcc
value: [0.42885627 0.43680534 0.44531435 0.42445002 0.43799107 0.42953857
0.43681603 0.41888833 0.43532634 0.4305393 ]
mean value: 0.43245256334649734
key: test_fscore
value: [0.45925926 0.46808511 0.37795276 0.42748092 0.46969697 0.57971014
0.48529412 0.453125 0.36923077 0.48120301]
mean value: 0.4571038046599415
key: train_fscore
value: [0.50671141 0.51741294 0.52325581 0.5033557 0.51864126 0.50796312
0.50723404 0.49747049 0.51092437 0.50590219]
mean value: 0.5098871334463608
key: test_precision
value: [0.63265306 0.6 0.58536585 0.62222222 0.67391304 0.78431373
0.67346939 0.70731707 0.54545455 0.68085106]
mean value: 0.6505559976283872
key: train_precision
value: [0.72771084 0.72727273 0.73770492 0.72289157 0.72790698 0.72661871
0.74686717 0.7195122 0.73607748 0.73349633]
mean value: 0.7306058914124508
key: test_recall
value: [0.36046512 0.38372093 0.27906977 0.3255814 0.36046512 0.45977011
0.37931034 0.33333333 0.27906977 0.37209302]
mean value: 0.3532878909382518
key: train_recall
value: [0.38867439 0.4015444 0.40540541 0.38610039 0.4028314 0.39046392
0.38402062 0.38015464 0.39124839 0.38610039]
mean value: 0.3916543937162494
key: test_accuracy
value: [0.78143713 0.7754491 0.76347305 0.7754491 0.79041916 0.82634731
0.79041916 0.79041916 0.75375375 0.79279279]
mean value: 0.7839959720199242
key: train_accuracy
value: [0.80426099 0.80625832 0.80892144 0.80292943 0.80659121 0.80459387
0.80725699 0.80159787 0.8063228 0.80499168]
mean value: 0.8053724595713758
key: test_roc_auc
value: [0.64394224 0.64750563 0.60526069 0.6285165 0.64999062 0.70761785
0.65726651 0.64237517 0.59904905 0.65568214]
mean value: 0.6437206399970089
key: train_roc_auc
value: [0.66896674 0.67450368 0.67755677 0.6672307 0.67514718 0.66964848
0.66934424 0.66426942 0.6711628 0.66858879]
mean value: 0.670641879401966
key: test_jcc
value: [0.29807692 0.30555556 0.23300971 0.27184466 0.30693069 0.40816327
0.32038835 0.29292929 0.22641509 0.31683168]
mean value: 0.29801452258917427
key: train_jcc
value: [0.33932584 0.34899329 0.35433071 0.33632287 0.35011186 0.34044944
0.33979475 0.33108866 0.34311512 0.33860045]
mean value: 0.3422132999818152
MCC on Blind test: 0.35
Accuracy on Blind test: 0.68
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.21587157 0.19852281 0.19795561 0.19829607 0.20926809 0.21731615
0.21976256 0.23083353 0.15375948 0.22372222]
mean value: 0.20653080940246582
key: score_time
value: [0.02084088 0.02056837 0.02355838 0.02055526 0.02059078 0.03319097
0.0207715 0.02839875 0.01439953 0.02065754]
mean value: 0.022353196144104005
key: test_mcc
value: [0.3478005 0.3478005 0.22329621 0.28893975 0.3943876 0.47637366
0.42018813 0.33148665 0.30613246 0.39141237]
mean value: 0.35278178297837276
key: train_mcc
value: [0.43126835 0.43680534 0.41418147 0.40000748 0.41030851 0.38363164
0.43934752 0.39757249 0.40168487 0.43368354]
mean value: 0.4148491204252315
key: test_fscore
value: [0.46808511 0.46808511 0.31404959 0.38095238 0.46031746 0.54814815
0.51094891 0.39344262 0.38709677 0.48120301]
mean value: 0.4412329098733461
key: train_fscore
value: [0.51286307 0.51741294 0.48671808 0.46956522 0.48192771 0.45053004
0.51510067 0.46544182 0.47272727 0.51170569]
mean value: 0.4883992499241053
key: test_precision
value: [0.6 0.6 0.54285714 0.6 0.725 0.77083333
0.7 0.68571429 0.63157895 0.68085106]
mean value: 0.653683477310297
key: train_precision
value: [0.72196262 0.72727273 0.72820513 0.72386059 0.72727273 0.71629213
0.73798077 0.72479564 0.72222222 0.73031026]
mean value: 0.7260174818526606
key: test_recall
value: [0.38372093 0.38372093 0.22093023 0.27906977 0.3372093 0.42528736
0.40229885 0.27586207 0.27906977 0.37209302]
mean value: 0.3359262229350441
key: train_recall
value: [0.3976834 0.4015444 0.36550837 0.34749035 0.36036036 0.32860825
0.39561856 0.34278351 0.35135135 0.39382239]
mean value: 0.36847709270389684
key: test_accuracy
value: [0.7754491 0.7754491 0.75149701 0.76646707 0.79640719 0.81736527
0.7994012 0.77844311 0.77177177 0.79279279]
mean value: 0.7825043606480733
key: train_accuracy
value: [0.80459387 0.80625832 0.8005992 0.79693742 0.79960053 0.79294274
0.80758988 0.79660453 0.79733777 0.80565724]
mean value: 0.800812150632213
key: test_roc_auc
value: [0.64750563 0.64750563 0.57820705 0.60727682 0.64642723 0.69037647
0.67078505 0.61566383 0.6111948 0.65568214]
mean value: 0.6370624650002372
key: train_roc_auc
value: [0.67212414 0.67450368 0.65895535 0.65061989 0.65660586 0.64163806
0.67334788 0.64872568 0.65211194 0.67155213]
mean value: 0.6600184617689904
key: test_jcc
value: [0.30555556 0.30555556 0.18627451 0.23529412 0.29896907 0.37755102
0.34313725 0.24489796 0.24 0.31683168]
mean value: 0.2854066728389154
key: train_jcc
value: [0.34486607 0.34899329 0.3216308 0.30681818 0.31746032 0.29076397
0.34689266 0.30330673 0.30952381 0.34382022]
mean value: 0.32340760485378484
MCC on Blind test: 0.35
Accuracy on Blind test: 0.67
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.42873073 0.44877243 0.4410708 0.439713 0.45077801 0.4527967
0.44538975 0.42600179 0.44786954 0.45865536]
mean value: 0.4439778089523315
key: score_time
value: [0.10152221 0.10636616 0.10989714 0.10888743 0.10870147 0.10489702
0.11190867 0.10766673 0.10169482 0.11362314]
mean value: 0.1075164794921875
key: test_mcc
value: [0.15572814 0.19769748 0.14401845 0.28258961 0.12931847 0.23501016
0.28212421 0.28862177 0.25246517 0.34443127]
mean value: 0.23120047359115387
key: train_mcc
value: [0.34881189 0.3529105 0.34345466 0.32493772 0.35911631 0.3165067
0.2951497 0.31895966 0.31867635 0.30055023]
mean value: 0.32790737301726625
key: test_fscore
value: [0.12631579 0.22641509 0.17475728 0.18947368 0.14141414 0.23076923
0.25242718 0.22222222 0.22 0.30769231]
mean value: 0.20914869361411528
key: train_fscore
value: [0.3256785 0.32352941 0.31578947 0.28632939 0.32169312 0.27233115
0.24751381 0.27922078 0.27891892 0.25982533]
mean value: 0.291082988293695
key: test_precision
value: [0.66666667 0.6 0.52941176 1. 0.53846154 0.70588235
0.8125 0.91666667 0.78571429 0.88888889]
mean value: 0.7444192164045104
key: train_precision
value: [0.86187845 0.88 0.86705202 0.875 0.9047619 0.88028169
0.86821705 0.87162162 0.87162162 0.85611511]
mean value: 0.8736549476483291
key: test_recall
value: [0.06976744 0.13953488 0.10465116 0.10465116 0.08139535 0.13793103
0.14942529 0.12643678 0.12790698 0.18604651]
mean value: 0.1227746591820369
key: train_recall
value: [0.2007722 0.1981982 0.19305019 0.17117117 0.1956242 0.16108247
0.1443299 0.16623711 0.16602317 0.15315315]
mean value: 0.17496417625283603
key: test_accuracy
value: [0.75149701 0.75449102 0.74550898 0.76946108 0.74550898 0.76047904
0.76946108 0.76946108 0.76576577 0.78378378]
mean value: 0.7615417813022602
key: train_accuracy
value: [0.7849534 0.78561917 0.78362184 0.77929427 0.78661784 0.77762983
0.77330226 0.77829561 0.77803661 0.77437604]
mean value: 0.7801746866629298
key: test_roc_auc
value: [0.52883533 0.55363841 0.53619655 0.55232558 0.5286009 0.55884406
0.56863977 0.5611941 0.55788061 0.58897467]
mean value: 0.5535129989132072
key: train_roc_auc
value: [0.59477317 0.59438424 0.5913612 0.58131976 0.59421982 0.57672616
0.56834987 0.57885464 0.57874767 0.57208825]
mean value: 0.5830824763760619
key: test_jcc
value: [0.06741573 0.12765957 0.09574468 0.10465116 0.07608696 0.13043478
0.14444444 0.125 0.12359551 0.18181818]
mean value: 0.11768510194579637
key: train_jcc
value: [0.19451372 0.19298246 0.1875 0.16708543 0.19167718 0.15762926
0.14123581 0.16226415 0.1620603 0.14930991]
mean value: 0.17062582082489314
MCC on Blind test: 0.2
Accuracy on Blind test: 0.52
Running classifier: 23
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.09107947 0.11798787 0.14775229 0.11377883 0.10112262 0.09877563
0.09433532 0.12147713 0.10012531 0.10663843]
mean value: 0.10930728912353516
key: score_time
value: [0.01101947 0.01209211 0.01185465 0.01186705 0.01130056 0.01145649
0.01151109 0.01198649 0.01131582 0.01181626]
mean value: 0.011621999740600585
key: test_mcc
value: [0.40569235 0.39149002 0.09744004 0.41328644 0.31792392 0.29601326
0.38486953 0.39168576 0.21723096 0.36618239]
mean value: 0.32818146607196164
key: train_mcc
value: [0.41253427 0.46265058 0.27222353 0.44311682 0.3832757 0.23120321
0.39544019 0.36688773 0.27843283 0.26085338]
mean value: 0.35066182372828864
key: test_fscore
value: [0.52702703 0.54216867 0.10416667 0.5698324 0.36206897 0.28301887
0.48920863 0.56505576 0.18367347 0.36363636]
mean value: 0.39898568322683237
key: train_fscore
value: [0.52743902 0.59050241 0.22172452 0.59729571 0.41438032 0.20643729
0.48459617 0.55042017 0.2516269 0.23245614]
mean value: 0.407687865556983
key: test_precision
value: [0.62903226 0.5625 0.5 0.5483871 0.7 0.78947368
0.65384615 0.41758242 0.75 0.83333333]
mean value: 0.6384154943811141
key: train_precision
value: [0.64672897 0.63461538 0.85344828 0.54978355 0.78214286 0.744
0.68470588 0.40835411 0.8 0.78518519]
mean value: 0.688896422161782
key: test_recall
value: [0.45348837 0.52325581 0.05813953 0.59302326 0.24418605 0.17241379
0.3908046 0.87356322 0.10465116 0.23255814]
mean value: 0.36460839347767976
key: train_recall
value: [0.44530245 0.55212355 0.12741313 0.65379665 0.28185328 0.11984536
0.375 0.84407216 0.14929215 0.13642214]
mean value: 0.3685120871976542
key: test_accuracy
value: [0.79041916 0.77245509 0.74251497 0.76946108 0.77844311 0.77245509
0.78742515 0.6497006 0.75975976 0.78978979]
mean value: 0.7612423801046556
key: train_accuracy
value: [0.79360852 0.80193076 0.76864181 0.77197071 0.79394141 0.76198402
0.79394141 0.64380826 0.7703827 0.76705491]
mean value: 0.7667264501463384
key: test_roc_auc
value: [0.68037322 0.69106339 0.51898912 0.71183421 0.60394786 0.57810973
0.65896505 0.72220671 0.54625271 0.6081819 ]
mean value: 0.6319923905484259
key: train_roc_auc
value: [0.68021746 0.72060601 0.55988977 0.73349914 0.62723109 0.55274135
0.65742819 0.70906481 0.568138 0.56170299]
mean value: 0.6370518783799006
key: test_jcc
value: [0.35779817 0.37190083 0.05494505 0.3984375 0.22105263 0.16483516
0.32380952 0.39378238 0.1011236 0.22222222]
mean value: 0.2609907067900116
key: train_jcc
value: [0.35817805 0.41894531 0.12468514 0.42581727 0.26133652 0.11509901
0.31978022 0.37971014 0.1439206 0.13151365]
mean value: 0.2678985905560448
MCC on Blind test: 0.29
Accuracy on Blind test: 0.64
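Note: the per-fold keys logged for each classifier (fit_time, score_time, test_mcc, train_mcc, ...) match what sklearn's cross_validate returns when given a dict of scorers. A self-contained sketch with toy data follows; the scorer names, the 10-fold StratifiedKFold and the stand-in estimator are assumptions, not the project's exact wiring.

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef, jaccard_score
from sklearn.model_selection import StratifiedKFold, cross_validate

rng = np.random.default_rng(42)
X, y = rng.random((200, 10)), rng.integers(0, 2, 200)   # toy stand-ins for the real feature table

scoring = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': 'f1', 'precision': 'precision', 'recall': 'recall',
           'accuracy': 'accuracy', 'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
out = cross_validate(SGDClassifier(random_state=42), X, y, cv=skf,
                     scoring=scoring, return_train_score=True)
# out['test_mcc'], out['train_mcc'], out['fit_time'], ... mirror the keys logged above
print(out['test_mcc'].mean())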
Running classifier: 24
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.85974932 0.39687037 0.40505552 0.40411663 0.55167603 0.40954828
0.40673447 0.40257382 0.54897761 0.41880846]
mean value: 0.4804110527038574
key: score_time
value: [0.01206374 0.01206017 0.01265001 0.0129106 0.01298714 0.01363111
0.01232481 0.01240969 0.01239753 0.01208806]
mean value: 0.012552285194396972
key: test_mcc
value: [0.30963425 0.36545433 0.3352472 0.51551197 0.3352472 0.38562087
0.40860486 0.40106786 0.46554789 0.36577646]
mean value: 0.3887712898058101
key: train_mcc
value: [1. 1. 1. 1. 1. 0.99913127
1. 1. 1. 0.9991321 ]
mean value: 0.9998263361569334
key: test_fscore
value: [0.46052632 0.50649351 0.47297297 0.6013986 0.47297297 0.52830189
0.5443038 0.49635036 0.57142857 0.4822695 ]
mean value: 0.513701849382651
key: train_fscore
value: [1. 1. 1. 1. 1. 0.99935525
1. 1. 1. 0.99935608]
mean value: 0.9998711339671184
key: test_precision
value: [0.53030303 0.57352941 0.56451613 0.75438596 0.56451613 0.58333333
0.6056338 0.68 0.68852459 0.61818182]
mean value: 0.616292420954052
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.40697674 0.45348837 0.40697674 0.5 0.40697674 0.48275862
0.49425287 0.3908046 0.48837209 0.39534884]
mean value: 0.44259556268377437
key: train_recall
value: [1. 1. 1. 1. 1. 0.99871134
1. 1. 1. 0.998713 ]
mean value: 0.9997424338919183
key: test_accuracy
value: [0.75449102 0.77245509 0.76646707 0.82934132 0.76646707 0.7754491
0.78443114 0.79341317 0.81081081 0.78078078]
mean value: 0.7834106561651472
key: train_accuracy
value: [1. 1. 1. 1. 1. 0.99966711
1. 1. 1. 0.99966722]
mean value: 0.9999334331817146
key: test_roc_auc
value: [0.64098837 0.66827644 0.64905289 0.72177419 0.64905289 0.68065057
0.69044627 0.66301363 0.70572451 0.6551643 ]
mean value: 0.6724144066520609
key: train_roc_auc
value: [1. 1. 1. 1. 1. 0.99935567
1. 1. 1. 0.9993565 ]
mean value: 0.9998712169459593
key: test_jcc
value: [0.2991453 0.33913043 0.30973451 0.43 0.30973451 0.35897436
0.37391304 0.33009709 0.4 0.31775701]
mean value: 0.3468486259653635
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
key: train_jcc
value: [1. 1. 1. 1. 1. 0.99871134
1. 1. 1. 0.998713 ]
mean value: 0.9997424338919183
MCC on Blind test: 0.37
Accuracy on Blind test: 0.7
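Note: the SettingWithCopyWarning emitted above points at the 'source_data' assignments in MultClfs_logo_skf.py (lines 419 and 446). The usual fix the warning itself suggests is sketched below with a toy frame; this is generic pandas advice, not the project's actual change.

import pandas as pd

scoresDF_CV = pd.DataFrame({'MCC': [0.33, 0.41]})   # toy stand-in for the sliced frame
scoresDF_CV = scoresDF_CV.copy()                    # explicit copy removes the view-vs-copy ambiguity
scoresDF_CV.loc[:, 'source_data'] = 'CV'            # the .loc assignment the warning recommends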
Extracting tts_split_name: logo_skf_BT_pnca
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 dfs using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_pnca
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
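Note: a minimal sketch of the rowbind described above (stacking the CV and BT score frames on their common columns with pd.concat). The toy frames are stand-ins for the real ones, which carry the 8 common columns listed in the log.

import pandas as pd

cv_df = pd.DataFrame({'MCC': [0.33, 0.41], 'Accuracy': [0.76, 0.78], 'source_data': 'CV'})
bt_df = pd.DataFrame({'MCC': [0.29, 0.35], 'Accuracy': [0.64, 0.70], 'source_data': 'BT'})

common = cv_df.columns.intersection(bt_df.columns)
combined_df_wf = pd.concat([cv_df[common], bt_df[common]], axis=0, ignore_index=True)  # rowbind
print(combined_df_wf.shape)   # nrows = len(cv_df) + len(bt_df), ncols = len(common)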
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================
BTS gene: gid
Total genes: 6
Training on: 4
Training on genes: ['katg', 'pnca', 'rpob', 'embb']
Omitted genes: ['alr', 'gid']
Blind test gene: gid
/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_gid.csv
Training data dim: (3231, 171)
Training Target dim: (3231,)
Checked training df does NOT have Target var
TEST data dim: (531, 171)
TEST Target dim: (531,)
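Note: the split above trains on four genes and holds out 'gid' as the blind test set. A hedged sketch of such a gene-wise split follows; the 'gene_name' grouping column and the 'target' column name are assumptions for illustration, not read from the script.

import pandas as pd

df = pd.read_csv('/home/tanu/git/Data/ml_combined/5genes_logo_skf_BT_gid.csv')  # path from the log

training_genes = ['katg', 'pnca', 'rpob', 'embb']
bts_gene = 'gid'

train_df = df[df['gene_name'].isin(training_genes)]
bts_df = df[df['gene_name'] == bts_gene]

X_train, y_train = train_df.drop(columns=['target']), train_df['target']
X_bts, y_bts = bts_df.drop(columns=['target']), bts_df['target']
# the log reports training dims (3231, 171) and blind-test dims (531, 171)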
==============================================================
Running several classification models (n): 24
List of models:
('AdaBoost Classifier', AdaBoostClassifier(random_state=42))
('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42))
('Decision Tree', DecisionTreeClassifier(random_state=42))
('Extra Tree', ExtraTreeClassifier(random_state=42))
('Extra Trees', ExtraTreesClassifier(random_state=42))
('Gradient Boosting', GradientBoostingClassifier(random_state=42))
('Gaussian NB', GaussianNB())
('Gaussian Process', GaussianProcessClassifier(random_state=42))
('K-Nearest Neighbors', KNeighborsClassifier())
('LDA', LinearDiscriminantAnalysis())
('Logistic Regression', LogisticRegression(random_state=42))
('Logistic RegressionCV', LogisticRegressionCV(cv=3, random_state=42))
('MLP', MLPClassifier(max_iter=500, random_state=42))
('Multinomial', MultinomialNB())
('Naive Bayes', BernoulliNB())
('Passive Aggressive', PassiveAggressiveClassifier(n_jobs=10, random_state=42))
('QDA', QuadraticDiscriminantAnalysis())
('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42))
('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42))
('Ridge Classifier', RidgeClassifier(random_state=42))
('Ridge ClassifierCV', RidgeClassifierCV(cv=3))
('SVC', SVC(random_state=42))
('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42))
('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0))
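Note: the "Running classifier: N / Model_name: ..." blocks suggest a loop over the (name, estimator) pairs listed above, each wrapped in the shared preprocessing step and cross-validated. A self-contained sketch with toy data and a two-model list follows; it is not the project's MultClfs_logo_skf.py driver.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import StratifiedKFold, cross_validate

rng = np.random.default_rng(42)
X, y = rng.random((200, 8)), rng.integers(0, 2, 200)   # toy stand-ins

models = [('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
          ('Gaussian NB', GaussianNB())]                # truncated illustrative list

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for i, (name, clf) in enumerate(models, start=1):
    print(f'Running classifier: {i}')
    print(f'Model_name: {name}')
    pipe = Pipeline([('prep', MinMaxScaler()), ('model', clf)])
    scores = cross_validate(pipe, X, y, cv=skf, return_train_score=True)
    print('mean CV accuracy:', scores['test_score'].mean())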
================================================================
Running classifier: 1
Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])
key: fit_time
value: [0.71346688 0.70873904 0.70537329 0.70846152 0.71706867 0.70674109
0.72169304 0.70572209 0.70857334 0.73036671]
mean value: 0.7126205682754516
key: score_time
value: [0.01884413 0.01858401 0.01866388 0.01852965 0.01893687 0.01872158
0.02005959 0.01865435 0.01872373 0.01847363]
mean value: 0.018819141387939452
key: test_mcc
value: [0.43626146 0.45495846 0.46173 0.39527106 0.41984123 0.36247866
0.38120285 0.4064668 0.41682989 0.46695327]
mean value: 0.420199368996363
key: train_mcc
value: [0.51461155 0.50124637 0.50745661 0.51426412 0.51899509 0.53297367
0.50668556 0.50932975 0.52108535 0.49615802]
mean value: 0.5122806095857543
key: test_fscore
value: [0.61083744 0.63551402 0.63461538 0.55913978 0.59183673 0.54450262
0.55958549 0.58163265 0.59296482 0.60962567]
mean value: 0.5920254617030783
key: train_fscore
value: [0.6572238 0.64980326 0.65626741 0.65879708 0.66178953 0.67006225
0.6487106 0.65264355 0.66137266 0.63588545]
mean value: 0.6552555584633271
key: test_precision
value: [0.65263158 0.63551402 0.65346535 0.65822785 0.65168539 0.61904762
0.63529412 0.64772727 0.64835165 0.72151899]
mean value: 0.6523463830648676
key: train_precision
value: [0.72681704 0.71270037 0.71221282 0.72256473 0.72682324 0.74092616
0.72750643 0.72474747 0.73241206 0.7311828 ]
mean value: 0.7257893118574106
key: test_recall
value: [0.57407407 0.63551402 0.61682243 0.48598131 0.54205607 0.48598131
0.5 0.52777778 0.5462963 0.52777778]
mean value: 0.5442281066112842
key: train_recall
value: [0.59979317 0.59710744 0.60847107 0.6053719 0.60743802 0.61157025
0.58531541 0.59358842 0.60289555 0.56256463]
mean value: 0.5974115864862787
key: test_accuracy
value: [0.75617284 0.75851393 0.76470588 0.74613003 0.75232198 0.73065015
0.73684211 0.74613003 0.74922601 0.77399381]
mean value: 0.7514686771394717
key: train_accuracy
value: [0.79188166 0.78576341 0.78782669 0.79126547 0.79332875 0.79951857
0.7892022 0.78988996 0.79470426 0.78576341]
mean value: 0.7909144388468
key: test_roc_auc
value: [0.71064815 0.72747923 0.7273927 0.68049065 0.69926878 0.66891658
0.67790698 0.69179587 0.69872954 0.7127261 ]
mean value: 0.6995354572677437
key: train_roc_auc
value: [0.74371102 0.73850217 0.74289533 0.74469626 0.74676025 0.75243461
0.73804668 0.74063759 0.74657915 0.72976248]
mean value: 0.7424025538671135
key: test_jcc
value: [0.43971631 0.46575342 0.46478873 0.3880597 0.42028986 0.37410072
0.38848921 0.41007194 0.42142857 0.43846154]
mean value: 0.4211160006067346
key: train_jcc
value: [0.48945148 0.48126561 0.48839138 0.49119866 0.49453322 0.50382979
0.48006785 0.48438819 0.4940678 0.46615253]
mean value: 0.487334649673293
MCC on Blind test: 0.04
Accuracy on Blind test: 0.72
Running classifier: 2
Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])
key: fit_time
value: [0.37061739 0.39370966 0.396842 0.3951602 0.40427089 0.39400506
0.39814663 0.40049386 0.40566206 0.39216852]
mean value: 0.395107626914978
key: score_time
value: [0.05094719 0.03579354 0.04647493 0.03929424 0.05052805 0.03775835
0.03286672 0.04663634 0.04964089 0.04799962]
mean value: 0.043793988227844236
key: test_mcc
value: [0.40996925 0.453128 0.37996811 0.40896849 0.40379053 0.40558762
0.38950215 0.46059499 0.35626965 0.39056578]
mean value: 0.40583445799430695
key: train_mcc
value: [0.95056501 0.96061292 0.96221225 0.95366911 0.95370518 0.96678941
0.95843678 0.95905462 0.96219183 0.9628472 ]
mean value: 0.9590084307857054
key: test_fscore
value: [0.57591623 0.62 0.5483871 0.56521739 0.56684492 0.57142857
0.52873563 0.62311558 0.52459016 0.53409091]
mean value: 0.5658326492758391
key: train_fscore
value: [0.96595745 0.97305864 0.97408778 0.96818664 0.96815287 0.9773565
0.9712766 0.97194283 0.97406035 0.97471022]
mean value: 0.9718789869500656
key: test_precision
value: [0.6626506 0.66666667 0.64556962 0.67532468 0.6625 0.65853659
0.6969697 0.68131868 0.64 0.69117647]
mean value: 0.6680712998896612
key: train_precision
value: [0.99452355 0.99567568 0.99783315 0.99455338 0.99563319 0.99677766
1. 0.99566161 0.9978308 0.99355532]
mean value: 0.9962044324962559
key: test_recall
value: [0.50925926 0.57943925 0.47663551 0.48598131 0.4953271 0.5046729
0.42592593 0.57407407 0.44444444 0.43518519]
mean value: 0.4930944963655245
key: train_recall
value: [0.93898656 0.95144628 0.95144628 0.94318182 0.94214876 0.95867769
0.94415719 0.94932782 0.95139607 0.9565667 ]
mean value: 0.9487335159434906
key: test_accuracy
value: [0.75 0.76470588 0.73993808 0.75232198 0.74922601 0.74922601
0.74613003 0.76780186 0.73065015 0.74613003]
mean value: 0.7496130030959752
key: train_accuracy
value: [0.97798418 0.98246217 0.98314993 0.97936726 0.97936726 0.9852132
0.98143054 0.98177442 0.98314993 0.98349381]
mean value: 0.9817392704324666
key: test_roc_auc
value: [0.68981481 0.71796037 0.67350294 0.68512028 0.68516355 0.68752163
0.66645134 0.71959518 0.65943152 0.66875538]
mean value: 0.6853317012404712
key: train_roc_auc
value: [0.96820462 0.97469221 0.97520768 0.97030225 0.97004345 0.97856565
0.97207859 0.97363351 0.97518284 0.97673776]
mean value: 0.9734648554557713
key: test_jcc
value: [0.40441176 0.44927536 0.37777778 0.39393939 0.39552239 0.4
0.359375 0.45255474 0.35555556 0.36434109]
mean value: 0.3952753072154017
key: train_jcc
value: [0.93415638 0.94753086 0.94948454 0.93833505 0.9382716 0.95571576
0.94415719 0.9454171 0.9494324 0.95066804]
mean value: 0.9453168911513533
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/joblib/externals/loky/process_executor.py:702: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
MCC on Blind test: 0.0
Accuracy on Blind test: 0.69
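Note: the repeated OOB warnings above are the standard symptom of BaggingClassifier's default n_estimators=10 leaving some training samples never out-of-bag. A hedged sketch of the usual mitigation (more estimators) follows; the value 100 is illustrative, not the project's setting.

from sklearn.ensemble import BaggingClassifier

bag = BaggingClassifier(n_estimators=100, oob_score=True, n_jobs=10, random_state=42)
# with more bootstrap rounds each sample is far more likely to be left out at
# least once, so oob_score_ / oob_decision_function_ become reliable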
Running classifier: 3
Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])
key: fit_time
value: [0.21111059 0.2128818 0.21197772 0.21439314 0.20772195 0.20596623
0.21204734 0.21391106 0.21615005 0.20947742]
mean value: 0.21156373023986816
key: score_time
value: [0.01074457 0.01070642 0.01034093 0.01018333 0.01008654 0.01032805
0.01021576 0.00988364 0.01014543 0.01007318]
mean value: 0.010270786285400391
key: test_mcc
value: [0.34681827 0.38273145 0.31124352 0.3498875 0.26476283 0.2664641
0.27851042 0.33229974 0.26637088 0.27333608]
mean value: 0.3072424801565944
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.55924171 0.6 0.54545455 0.5388601 0.5158371 0.5047619
0.53275109 0.55555556 0.52212389 0.51401869]
mean value: 0.5388604596729636
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.57281553 0.56097561 0.53097345 0.60465116 0.5 0.51456311
0.50413223 0.55555556 0.5 0.51886792]
mean value: 0.5362534576139744
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.5462963 0.64485981 0.56074766 0.48598131 0.53271028 0.4953271
0.56481481 0.55555556 0.5462963 0.50925926]
mean value: 0.5441848390446522
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.71296296 0.71517028 0.69040248 0.7244582 0.66873065 0.67801858
0.66873065 0.70278638 0.66563467 0.67801858]
mean value: 0.6904913427359247
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.6712963 0.69742991 0.65768865 0.66428695 0.6344107 0.63192281
0.64287252 0.66614987 0.63593885 0.63602498]
mean value: 0.6538021525111288
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.38815789 0.42857143 0.375 0.36879433 0.34756098 0.33757962
0.36309524 0.38461538 0.35329341 0.34591195]
mean value: 0.3692580228563366
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: -0.01
Accuracy on Blind test: 0.52
Running classifier: 4
Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])
key: fit_time
value: [0.02205777 0.02197671 0.02214074 0.02183557 0.02188349 0.02228212
0.02192092 0.02194476 0.0221591 0.02229953]
mean value: 0.022050070762634277
key: score_time
value: [0.01012254 0.01007295 0.00998473 0.00998831 0.01003695 0.00991249
0.01000834 0.00998783 0.01036167 0.01018929]
mean value: 0.010066509246826172
key: test_mcc
value: [0.19215607 0.26280481 0.29904407 0.33691135 0.26632586 0.24736256
0.15647975 0.21743135 0.14854024 0.38181852]
mean value: 0.25088745754828046
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.4549763 0.50925926 0.55084746 0.54 0.51376147 0.50228311
0.4372093 0.47663551 0.44444444 0.6 ]
mean value: 0.5029416853905369
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.46601942 0.50458716 0.50387597 0.58064516 0.5045045 0.49107143
0.43925234 0.48113208 0.42735043 0.56557377]
mean value: 0.49640122465600617
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.44444444 0.51401869 0.60747664 0.5046729 0.52336449 0.51401869
0.43518519 0.47222222 0.46296296 0.63888889]
mean value: 0.5117255105572862
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.64506173 0.67182663 0.67182663 0.71517028 0.67182663 0.6625387
0.625387 0.65325077 0.6130031 0.71517028]
mean value: 0.6645061728395062
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.59490741 0.63200935 0.65559017 0.66205867 0.63436743 0.6250649
0.57805771 0.60820413 0.57566753 0.69618863]
mean value: 0.6262115924879453
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.29447853 0.34161491 0.38011696 0.36986301 0.34567901 0.33536585
0.2797619 0.31288344 0.28571429 0.42857143]
mean value: 0.3374049327837274
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: -0.07
Accuracy on Blind test: 0.56
Running classifier: 5
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])
key: fit_time
value: [0.55033875 0.50822544 0.51796842 0.50145197 0.50207162 0.50933361
0.51347113 0.52098894 0.53057575 0.52549815]
mean value: 0.517992377281189
key: score_time
value: [0.02689409 0.02533078 0.02526283 0.02533364 0.02574825 0.02521038
0.02508998 0.02556133 0.02518654 0.02619076]
mean value: 0.025580859184265135
key: test_mcc
value: [0.30628195 0.43266022 0.41057402 0.39487402 0.40558762 0.34513984
0.28374537 0.35626965 0.34395674 0.46806941]
mean value: 0.3747158836650517
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.49462366 0.59793814 0.56989247 0.53714286 0.57142857 0.51648352
0.48128342 0.52459016 0.50561798 0.61375661]
mean value: 0.5412757396096122
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.58974359 0.66666667 0.67088608 0.69117647 0.65853659 0.62666667
0.56962025 0.64 0.64285714 0.71604938]
mean value: 0.6472202833718128
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.42592593 0.54205607 0.4953271 0.43925234 0.5046729 0.43925234
0.41666667 0.44444444 0.41666667 0.53703704]
mean value: 0.4661301488404293
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.70987654 0.75851393 0.75232198 0.74922601 0.74922601 0.72755418
0.6996904 0.73065015 0.72755418 0.77399381]
mean value: 0.7378607193364676
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.63888889 0.70389841 0.68747837 0.67101506 0.68752163 0.65481135
0.62926357 0.65943152 0.6501938 0.71503015]
mean value: 0.6697532742479493
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.32857143 0.42647059 0.39849624 0.3671875 0.4 0.34814815
0.31690141 0.35555556 0.33834586 0.44274809]
mean value: 0.37224248258273424
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.03
Accuracy on Blind test: 0.65
Running classifier: 6
Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])
key: fit_time
value: [3.46885061 3.44125056 3.39057827 3.59237599 3.44193816 3.42106009
3.42562819 3.43559313 3.43834209 3.49363422]
mean value: 3.4549251317977907
key: score_time
value: [0.01116729 0.01049709 0.01061249 0.01098275 0.0105741 0.01058292
0.01081872 0.01054811 0.01067543 0.01055026]
mean value: 0.01070091724395752
key: test_mcc
value: [0.46315109 0.50482163 0.50679775 0.42907005 0.39666624 0.40044611
0.43641862 0.47874557 0.39420092 0.5437686 ]
mean value: 0.4554086583988184
key: train_mcc
value: [0.65910978 0.65204058 0.64702346 0.64941791 0.65772915 0.63945461
0.64752473 0.65588565 0.64508648 0.64336148]
mean value: 0.6496633835552068
key: test_fscore
value: [0.61458333 0.65671642 0.64948454 0.5698324 0.57435897 0.55737705
0.58823529 0.62886598 0.56544503 0.66666667]
mean value: 0.6071565679443962
key: train_fscore
value: [0.75317186 0.74369501 0.7458405 0.74477958 0.75086906 0.73842593
0.7427241 0.75200918 0.74351585 0.73905429]
mean value: 0.7454085369718941
key: test_precision
value: [0.70238095 0.70212766 0.72413793 0.70833333 0.63636364 0.67105263
0.69620253 0.70930233 0.65060241 0.77777778]
mean value: 0.6978281188909117
key: train_precision
value: [0.85136897 0.86024423 0.83870968 0.84920635 0.85488127 0.83947368
0.84953395 0.84516129 0.83984375 0.84852547]
mean value: 0.8476948644937107
key: test_recall
value: [0.5462963 0.61682243 0.58878505 0.47663551 0.52336449 0.47663551
0.50925926 0.56481481 0.5 0.58333333]
mean value: 0.5385946694357908
key: train_recall
value: [0.67528438 0.65495868 0.6714876 0.66322314 0.66942149 0.65909091
0.65977249 0.67735264 0.66701138 0.65460186]
mean value: 0.6652204568957414
key: test_accuracy
value: [0.77160494 0.78637771 0.78947368 0.76160991 0.74303406 0.74922601
0.76160991 0.77708978 0.74303406 0.80495356]
mean value: 0.7688013607002254
key: train_accuracy
value: [0.85276918 0.8497249 0.84766162 0.84869326 0.85213205 0.84456671
0.8480055 0.85144429 0.84697387 0.84628611]
mean value: 0.8488257485962121
key: test_roc_auc
value: [0.71527778 0.7435964 0.73883697 0.68970665 0.68760817 0.68044739
0.69881568 0.72426787 0.68255814 0.7498062 ]
mean value: 0.711092123692917
key: train_roc_auc
value: [0.80826075 0.80093295 0.80352731 0.80223013 0.80636023 0.79810216
0.80077754 0.80776442 0.80182099 0.79819222]
mean value: 0.8027968683454685
key: test_jcc
value: [0.44360902 0.48888889 0.48091603 0.3984375 0.4028777 0.38636364
0.41666667 0.45864662 0.39416058 0.5 ]
mean value: 0.43705666433346196
key: train_jcc
value: [0.60407031 0.59197012 0.5946935 0.59334566 0.60111317 0.5853211
0.59074074 0.6025759 0.59174312 0.58611111]
mean value: 0.594168472850533
MCC on Blind test: -0.03
Accuracy on Blind test: 0.68
Running classifier: 7
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianNB())])
key: fit_time
value: [0.02324271 0.02277565 0.02242827 0.02272677 0.02300692 0.02192473
0.0234952 0.02331281 0.02321529 0.02152634]
mean value: 0.022765469551086426
key: score_time
value: [0.01098967 0.01065516 0.01089811 0.01105499 0.01126099 0.01093316
0.01130462 0.01136613 0.01045918 0.01127911]
mean value: 0.01102011203765869
key: test_mcc
value: [0.19655341 0.32889594 0.35337809 0.31621165 0.2856137 0.3221694
0.2587656 0.27594703 0.24317621 0.38654397]
mean value: 0.296725500260769
key: train_mcc
value: [0.31233322 0.30160683 0.29935751 0.30375667 0.30452207 0.31823447
0.31042028 0.309612 0.30801867 0.30138472]
mean value: 0.3069246439778996
key: test_fscore
value: [0.50757576 0.58333333 0.59109312 0.56302521 0.54237288 0.57258065
0.53441296 0.54618474 0.53435115 0.61290323]
mean value: 0.5587833010185285
key: train_fscore
value: [0.56864482 0.55956679 0.55884996 0.56398941 0.56502636 0.57318681
0.56645852 0.56674058 0.56742557 0.56028687]
mean value: 0.5650175676107145
key: test_precision
value: [0.42948718 0.49044586 0.52142857 0.51145038 0.49612403 0.5035461
0.47482014 0.4822695 0.45454545 0.54285714]
mean value: 0.4906974367599872
key: train_precision
value: [0.49728892 0.49679487 0.49443561 0.49229584 0.49159021 0.49885233
0.49803922 0.49611801 0.49202733 0.49446203]
mean value: 0.49519043828789455
key: test_recall
value: [0.62037037 0.71962617 0.68224299 0.62616822 0.59813084 0.6635514
0.61111111 0.62962963 0.64814815 0.7037037 ]
mean value: 0.6502682589131187
key: train_recall
value: [0.663909 0.64049587 0.64256198 0.66012397 0.6642562 0.67355372
0.65667011 0.66080662 0.67011375 0.64632885]
mean value: 0.6578820070594068
key: test_accuracy
value: [0.59876543 0.65944272 0.6873065 0.67801858 0.66563467 0.67182663
0.64396285 0.6501548 0.62229102 0.70278638]
mean value: 0.6580189580705575
key: train_accuracy
value: [0.66494668 0.66437414 0.66231087 0.66024759 0.65955983 0.66609354
0.66574966 0.66403026 0.66024759 0.66265475]
mean value: 0.6630214906011151
key: test_roc_auc
value: [0.60416667 0.6746279 0.6860289 0.66493596 0.64860246 0.66973866
0.63578811 0.64504737 0.62872524 0.70301464]
mean value: 0.6560675919888591
key: train_roc_auc
value: [0.66468646 0.65839226 0.65736347 0.66021662 0.66073635 0.66796243
0.66347158 0.66322144 0.66272303 0.65855855]
mean value: 0.661733219254407
key: test_jcc
value: [0.34010152 0.41176471 0.41954023 0.39181287 0.37209302 0.40112994
0.36464088 0.37569061 0.36458333 0.44186047]
mean value: 0.38832175810280845
key: train_jcc
value: [0.39727723 0.38847118 0.38778055 0.39274739 0.39375383 0.4017252
0.39514624 0.39542079 0.39608802 0.38916563]
mean value: 0.39375760454362546
MCC on Blind test: 0.07
Accuracy on Blind test: 0.24
Running classifier: 8
Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])
key: fit_time
value: [3.72386813 3.46688414 3.49233007 3.58388376 3.65783072 3.57631755
3.61510754 3.50474644 3.65729499 3.54151773]
mean value: 3.581978106498718
key: score_time
value: [0.09795856 0.10127211 0.09668136 0.10716033 0.0966568 0.11194348
0.1075511 0.10887027 0.1460917 0.09752226]
mean value: 0.10717079639434815
key: test_mcc
value: [0.29973753 0.3493622 0.36024678 0.36365071 0.26541227 0.33793216
0.2804783 0.3301645 0.31956107 0.31813132]
mean value: 0.3224676847114808
key: train_mcc
value: [0.62941564 0.61565615 0.60100533 0.62064547 0.62779173 0.62720294
0.62198166 0.63468891 0.6273002 0.60737338]
mean value: 0.6213061401535909
key: test_fscore
value: [0.46242775 0.52688172 0.53968254 0.48148148 0.42857143 0.46625767
0.44444444 0.47953216 0.46428571 0.45783133]
mean value: 0.4751396232316007
key: train_fscore
value: [0.71234568 0.69559748 0.68710889 0.70391061 0.70851461 0.70698254
0.70175439 0.71277259 0.70551378 0.68289558]
mean value: 0.7017396148760096
key: test_precision
value: [0.61538462 0.62025316 0.62195122 0.70909091 0.59016393 0.67857143
0.6031746 0.65079365 0.65 0.65517241]
mean value: 0.6394555939303698
key: train_precision
value: [0.88361409 0.88906752 0.87142857 0.88180404 0.88923557 0.89150943
0.89030207 0.89655172 0.89507154 0.8973064 ]
mean value: 0.8885890961643422
key: test_recall
value: [0.37037037 0.45794393 0.47663551 0.36448598 0.3364486 0.35514019
0.35185185 0.37962963 0.36111111 0.35185185]
mean value: 0.3805469020422291
key: train_recall
value: [0.5966908 0.57128099 0.56714876 0.5857438 0.58884298 0.5857438
0.57911065 0.59152017 0.58221303 0.55118925]
mean value: 0.5799484218892887
key: test_accuracy
value: [0.71296296 0.72755418 0.73065015 0.73993808 0.70278638 0.73065015
0.70588235 0.7244582 0.72136223 0.72136223]
mean value: 0.7217606925811261
key: train_accuracy
value: [0.83969728 0.83356259 0.82806052 0.83596974 0.83872077 0.83837689
0.83631362 0.8414718 0.83837689 0.82977992]
mean value: 0.8360330019698219
key: test_roc_auc
value: [0.62731481 0.65952752 0.6665585 0.64520595 0.61035393 0.63590343
0.61778639 0.63865202 0.63171835 0.6294143 ]
mean value: 0.6362435199272299
key: train_roc_auc
value: [0.77875777 0.76785699 0.76269809 0.77328427 0.77612252 0.7750884
0.77178098 0.77875854 0.77410497 0.75988107]
mean value: 0.7718333599644396
key: test_jcc
value: [0.30075188 0.35766423 0.36956522 0.31707317 0.27272727 0.304
0.28571429 0.31538462 0.30232558 0.296875 ]
mean value: 0.3122081256620425
key: train_jcc
value: [0.55321189 0.53326905 0.52335558 0.54310345 0.54860443 0.54676953
0.54054054 0.55372701 0.54501452 0.51848249]
mean value: 0.5406078474276688
MCC on Blind test: 0.01
Accuracy on Blind test: 0.65
Running classifier: 9
Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', KNeighborsClassifier())])
key: fit_time
value: [0.02330852 0.01772833 0.01797438 0.01814914 0.01744699 0.01770377
0.01865363 0.01959276 0.01986504 0.01938367]
mean value: 0.01898062229156494
key: score_time
value: [0.05328751 0.02985787 0.02699709 0.04000139 0.028054 0.02597117
0.02999616 0.03015733 0.02659822 0.02819324]
mean value: 0.031911396980285646
key: test_mcc
value: [0.17789202 0.26652129 0.28438515 0.37623149 0.22049776 0.20828383
0.1941968 0.20701415 0.12496999 0.25663526]
mean value: 0.23166277434301435
key: train_mcc
value: [0.5057614 0.49524884 0.47676866 0.49203917 0.49489219 0.48824642
0.50982968 0.51580117 0.51216208 0.49014613]
mean value: 0.498089574244585
key: test_fscore
value: [0.40641711 0.47916667 0.49484536 0.51461988 0.4180791 0.42391304
0.3908046 0.41758242 0.35555556 0.45652174]
mean value: 0.43575054723248263
key: train_fscore
value: [0.63862559 0.62843489 0.60948905 0.62693683 0.62665066 0.62275449
0.64052288 0.64489311 0.64123957 0.61829268]
mean value: 0.629783975215296
key: test_precision
value: [0.48101266 0.54117647 0.55172414 0.6875 0.52857143 0.50649351
0.51515152 0.51351351 0.44444444 0.55263158]
mean value: 0.5322219253868894
key: train_precision
value: [0.74757282 0.74504249 0.74112426 0.74084507 0.747851 0.74074074
0.7527933 0.75732218 0.75668073 0.75334324]
mean value: 0.7483315825248681
key: test_recall
value: [0.35185185 0.42990654 0.44859813 0.41121495 0.34579439 0.36448598
0.31481481 0.35185185 0.2962963 0.38888889]
mean value: 0.37037037037037035
key: train_recall
value: [0.557394 0.54338843 0.51756198 0.54338843 0.5392562 0.53719008
0.557394 0.56153051 0.55635988 0.52430196]
mean value: 0.5437765475569838
key: test_accuracy
value: [0.65740741 0.69040248 0.69659443 0.74303406 0.68111455 0.67182663
0.67182663 0.67182663 0.64086687 0.69040248]
mean value: 0.6815302144249513
key: train_accuracy
value: [0.79016168 0.78610729 0.77922971 0.78473177 0.78610729 0.78335626
0.79195323 0.79436039 0.79298487 0.78473177]
mean value: 0.787372426467631
key: test_roc_auc
value: [0.58101852 0.62467549 0.63402129 0.65931118 0.59650831 0.59428003
0.5829888 0.592205 0.55512489 0.61537468]
mean value: 0.6035508182601205
key: train_roc_auc
value: [0.73178978 0.72530246 0.7136779 0.72427153 0.72426727 0.72168782
0.73310195 0.735943 0.73361528 0.71938952]
mean value: 0.7263046523058744
key: test_jcc
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
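Note: the ConvergenceWarning above is the stock lbfgs message from LogisticRegression; since the pipeline already MinMax-scales the features, the usual next step is simply a larger max_iter. A hedged sketch follows; 5000 is an illustrative value, not the project's setting.

from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression(max_iter=5000, random_state=42)  # drop-in for LogisticRegression(random_state=42)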
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
[0.25503356 0.31506849 0.32876712 0.34645669 0.26428571 0.26896552
0.24285714 0.26388889 0.21621622 0.29577465]
mean value: 0.27973139937753877
key: train_jcc
value: [0.46910357 0.45818815 0.43832021 0.45659722 0.45629371 0.45217391
0.47115385 0.47589833 0.47192982 0.44748455]
mean value: 0.4597143332953504
MCC on Blind test: -0.02
Accuracy on Blind test: 0.64
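The lbfgs ConvergenceWarning repeated above comes from logistic-regression fits hitting their iteration cap. A minimal sketch of the two remedies the warning itself suggests, assuming a plain LogisticRegression step rather than the exact estimator used in this run: raise max_iter and keep a scaler in front of the solver.

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical remedy, not the settings used in this log: more lbfgs iterations
# and features scaled into [0, 1] before the solver sees them.
lr_fix = Pipeline(steps=[
    ('scale', MinMaxScaler()),
    ('model', LogisticRegression(solver='lbfgs', max_iter=1000, random_state=42)),
])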
Running classifier: 10
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.11155748 0.13310885 0.11811233 0.11578465 0.12014413 0.11473989
0.11371827 0.11475849 0.11488843 0.11638999]
mean value: 0.11732025146484375
key: score_time
value: [0.0204742 0.01334834 0.01330829 0.0138762 0.01326299 0.01328826
0.01306581 0.01339364 0.01353717 0.01329374]
mean value: 0.014084863662719726
key: test_mcc
value: [0.42842071 0.48290931 0.49110899 0.35195063 0.45576365 0.29324576
0.39610584 0.41682989 0.37481434 0.43254438]
mean value: 0.4123693510378182
key: train_mcc
value: [0.48060422 0.47648791 0.48834241 0.49311027 0.48908078 0.49380377
0.48304439 0.49399322 0.49383289 0.47710005]
mean value: 0.4869399921887429
key: test_fscore
value: [0.59487179 0.65420561 0.66046512 0.51933702 0.61538462 0.5025641
0.56994819 0.59296482 0.55670103 0.59375 ]
mean value: 0.5860192294727739
key: train_fscore
value: [0.63425664 0.63080408 0.63827371 0.64418212 0.63863636 0.64577504
0.63574661 0.64245176 0.64204545 0.62984055]
mean value: 0.6382012324638342
key: test_precision
value: [0.66666667 0.65420561 0.65740741 0.63513514 0.68181818 0.55681818
0.64705882 0.64835165 0.62790698 0.67857143]
mean value: 0.6453940057518884
key: train_precision
value: [0.69950125 0.69799499 0.70870113 0.70653514 0.70959596 0.7045177
0.70162297 0.71194969 0.71248424 0.7008872 ]
mean value: 0.7053790268076962
key: test_recall
value: [0.53703704 0.65420561 0.6635514 0.43925234 0.56074766 0.45794393
0.50925926 0.5462963 0.5 0.52777778]
mean value: 0.539607130494981
key: train_recall
value: [0.58014478 0.57541322 0.58057851 0.59194215 0.58057851 0.59607438
0.5811789 0.58531541 0.58428128 0.57187177]
mean value: 0.5827378917500663
key: test_accuracy
value: [0.75617284 0.77089783 0.77399381 0.73065015 0.76780186 0.6996904
0.74303406 0.74922601 0.73374613 0.75851393]
mean value: 0.7483727019072737
key: train_accuracy
value: [0.77743378 0.77579092 0.78094911 0.78232462 0.78129298 0.78232462
0.77854195 0.78335626 0.78335626 0.77647868]
mean value: 0.7801849186306922
key: test_roc_auc
value: [0.70138889 0.74145466 0.74612755 0.65712617 0.71555902 0.63869418
0.68486219 0.69872954 0.6755814 0.70109819]
mean value: 0.6960621785119177
key: train_roc_auc
value: [0.72795899 0.72559321 0.73075317 0.73463087 0.73101091 0.73566606
0.72902325 0.73366749 0.73340803 0.72514248]
mean value: 0.7306854449694525
key: test_jcc
value: [0.42335766 0.48611111 0.49305556 0.35074627 0.44444444 0.33561644
0.39855072 0.42142857 0.38571429 0.42222222]
mean value: 0.4161247286360329
key: train_jcc
value: [0.46440397 0.46071133 0.46872394 0.47512438 0.46911519 0.4768595
0.46600332 0.47324415 0.47280335 0.45968412]
mean value: 0.4686673250244061
MCC on Blind test: -0.03
Accuracy on Blind test: 0.82
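The pipeline printed for this classifier min-max scales the 165 numeric columns, one-hot encodes the six categorical ones and passes the remaining columns through before LDA. A sketch of how such a pipeline can be assembled; the column lists below are short illustrative subsets taken from the logged index, not the actual 165-column selection.

from sklearn.compose import ColumnTransformer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

num_cols = ['KOLA920101', 'MIYS930101', 'snap2_score']            # subset of the numeric features
cat_cols = ['electrostatics_change', 'active_site', 'ss_class']   # subset of the categorical features

prep = ColumnTransformer(remainder='passthrough',
                         transformers=[('num', MinMaxScaler(), num_cols),
                                       ('cat', OneHotEncoder(), cat_cols)])

lda_pipe = Pipeline(steps=[('prep', prep),
                           ('model', LinearDiscriminantAnalysis())])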
Running classifier: 11
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.0803206 0.07417941 0.11596656 0.06954217 0.0704546 0.06977701
0.07444453 0.07653522 0.07818866 0.06757665]
mean value: 0.07769854068756103
key: score_time
value: [0.02215576 0.01816607 0.01759505 0.01781178 0.018121 0.01625586
0.01625538 0.01627874 0.01562166 0.01795173]
mean value: 0.017621302604675294
key: test_mcc
value: [0.4259217 0.52051826 0.51452739 0.38883337 0.4409893 0.36711753
0.37181194 0.42120675 0.37698689 0.41589376]
mean value: 0.4243806893752599
key: train_mcc
value: [0.46662064 0.45574607 0.45869489 0.46871112 0.45574607 0.46923374
0.46269655 0.46472588 0.46854074 0.44989202]
mean value: 0.46206077082863156
key: test_fscore
value: [0.6 0.67619048 0.67298578 0.53932584 0.60512821 0.55384615
0.53551913 0.59183673 0.56122449 0.57894737]
mean value: 0.5915004178445895
key: train_fscore
value: [0.6230813 0.61318052 0.61458932 0.62328767 0.61318052 0.62457338
0.61907481 0.61970218 0.62328767 0.6093929 ]
mean value: 0.618335025434744
key: test_precision
value: [0.65217391 0.68932039 0.68269231 0.67605634 0.67045455 0.61363636
0.65333333 0.65909091 0.625 0.67073171]
mean value: 0.6592489805945695
key: train_precision
value: [0.69191919 0.68854569 0.69210867 0.69642857 0.68854569 0.69493671
0.69132653 0.6944801 0.6955414 0.68292683]
mean value: 0.6916759380679195
key: test_recall
value: [0.55555556 0.6635514 0.6635514 0.44859813 0.55140187 0.5046729
0.4537037 0.53703704 0.50925926 0.50925926]
mean value: 0.5396590515749395
key: train_recall
value: [0.56670114 0.55268595 0.55268595 0.56404959 0.55268595 0.56714876
0.56049638 0.55946225 0.56463289 0.55015512]
mean value: 0.5590703974975855
key: test_accuracy
value: [0.75308642 0.78947368 0.78637771 0.74613003 0.76160991 0.73065015
0.73684211 0.75232198 0.73374613 0.75232198]
mean value: 0.7542560103963611
key: train_accuracy
value: [0.77192982 0.76788171 0.76925722 0.77303989 0.76788171 0.77303989
0.77063274 0.77166437 0.77303989 0.76547455]
mean value: 0.7703841791549024
key: test_roc_auc
value: [0.7037037 0.75770163 0.75538681 0.67105832 0.7085713 0.67363274
0.66638674 0.69875108 0.67788544 0.69183893]
mean value: 0.700491670490312
key: train_roc_auc
value: [0.72046397 0.71397184 0.71500277 0.72068459 0.71397184 0.72146098
0.71790919 0.71842252 0.72075024 0.71145056]
mean value: 0.7174088503366128
key: test_jcc
value: [0.42857143 0.51079137 0.50714286 0.36923077 0.43382353 0.38297872
0.36567164 0.42028986 0.39007092 0.40740741]
mean value: 0.4215978500924281
key: train_jcc
value: [0.45251858 0.44214876 0.44361526 0.45273632 0.44214876 0.45409429
0.44830438 0.44896266 0.45273632 0.43822076]
mean value: 0.4475486084230635
MCC on Blind test: 0.01
Accuracy on Blind test: 0.8
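The fit_time/score_time arrays and the paired test_*/train_* keys logged for every classifier follow the dictionary layout returned by sklearn's cross_validate. A sketch of how those ten-value arrays can be produced; the scorer definitions and the 10-fold stratified split are assumptions read off the key names and array lengths, with lda_pipe from the sketch above and a feature matrix X with binary labels y.

from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

scoring = {'mcc': make_scorer(matthews_corrcoef),
           'fscore': 'f1',
           'precision': 'precision',
           'recall': 'recall',
           'accuracy': 'accuracy',
           'roc_auc': 'roc_auc',
           'jcc': make_scorer(jaccard_score)}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
cv_scores = cross_validate(lda_pipe, X, y, cv=skf, scoring=scoring,
                           return_train_score=True)   # return_train_score adds the train_* keys
print(cv_scores['test_mcc'], cv_scores['test_mcc'].mean())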
Running classifier: 12
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(cv=3, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', LogisticRegressionCV(cv=3, random_state=42))])
key: fit_time
value: [1.05378366 0.88285708 0.95072913 0.95866823 0.93553185 1.0282433
0.87286925 1.0207088 0.86641717 0.92480636]
mean value: 0.9494614839553833
key: score_time
value: [0.0133028 0.01337171 0.01374698 0.01709104 0.01687169 0.01574945
0.0133152 0.01565242 0.0132761 0.01375079]
mean value: 0.014612817764282226
key: test_mcc
value: [0.36946045 0.47399896 0.43654469 0.43560298 0.40729987 0.3710857
0.34430997 0.39172684 0.32802871 0.43198514]
mean value: 0.39900433008698216
key: train_mcc
value: [0.42817805 0.40232224 0.40972443 0.44823283 0.44713204 0.45584644
0.42053717 0.45318843 0.42818061 0.4065105 ]
mean value: 0.42998527571713535
key: test_fscore
value: [0.54255319 0.63316583 0.60606061 0.56818182 0.58585859 0.55208333
0.47904192 0.57142857 0.49438202 0.5698324 ]
mean value: 0.5602588276372217
key: train_fscore
value: [0.57594168 0.55426119 0.56078192 0.60779817 0.60654796 0.61030689
0.57297949 0.61168385 0.57815329 0.55521283]
mean value: 0.5833667263855598
key: test_precision
value: [0.6375 0.68478261 0.65934066 0.72463768 0.63736264 0.62352941
0.6779661 0.63636364 0.62857143 0.71830986]
mean value: 0.6628364024107984
key: train_precision
value: [0.69808542 0.68174962 0.68609865 0.68298969 0.68305304 0.69433465
0.68740955 0.68549422 0.6942029 0.68807339]
mean value: 0.6881491146835292
key: test_recall
value: [0.47222222 0.58878505 0.56074766 0.46728972 0.54205607 0.4953271
0.37037037 0.51851852 0.40740741 0.47222222]
mean value: 0.4894946348217376
key: train_recall
value: [0.4901758 0.46694215 0.47417355 0.54752066 0.54545455 0.54442149
0.49120993 0.55222337 0.49534643 0.46535677]
mean value: 0.5072824702795559
key: test_accuracy
value: [0.7345679 0.77399381 0.75851393 0.76470588 0.74613003 0.73374613
0.73065015 0.73993808 0.72136223 0.76160991]
mean value: 0.746521805603333
key: train_accuracy
value: [0.75988992 0.75 0.75275103 0.7647868 0.76444292 0.76856946
0.7565337 0.76685007 0.75962861 0.75206327]
mean value: 0.7595515780578019
key: test_roc_auc
value: [0.66898148 0.72726289 0.70861457 0.68966338 0.69463915 0.67358948
0.64099914 0.68484065 0.64323859 0.68959948]
mean value: 0.6821428815796888
key: train_roc_auc
value: [0.69225285 0.67908963 0.68296307 0.71035827 0.70958294 0.71241693
0.68996354 0.71299989 0.69331979 0.68012815]
mean value: 0.6963075061128654
key: test_jcc
value: [0.37226277 0.46323529 0.43478261 0.3968254 0.41428571 0.38129496
0.31496063 0.4 0.32835821 0.3984375 ]
mean value: 0.39044430905522987
key: train_jcc
value: [0.40443686 0.38337574 0.38964346 0.43657331 0.43528442 0.43916667
0.40152156 0.44059406 0.40662139 0.38428693]
mean value: 0.4121504403758539
MCC on Blind test: -0.02
Accuracy on Blind test: 0.64
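Each classifier block closes with "MCC on Blind test" and "Accuracy on Blind test". A sketch of how those two lines can be computed once the pipeline has been refitted on the cross-validation data; X_blind and y_blind are hypothetical names for the held-out blind set, which is not shown in this log.

from sklearn.metrics import accuracy_score, matthews_corrcoef

fitted = lda_pipe.fit(X, y)              # refit on the CV data (names assumed from earlier sketches)
y_pred = fitted.predict(X_blind)         # X_blind / y_blind: hypothetical blind hold-out set
print('MCC on Blind test:', round(matthews_corrcoef(y_blind, y_pred), 2))
print('Accuracy on Blind test:', round(accuracy_score(y_blind, y_pred), 2))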
Running classifier: 13
Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])
key: fit_time
value: [4.87325335 2.32943702 9.3246069 5.94298673 7.06027317 6.8252573
8.72361493 3.28780174 6.83220744 3.41255331]
mean value: 5.861199188232422
key: score_time
value: [0.01379395 0.01759601 0.01468706 0.0147028 0.01480675 0.01438427
0.0140655 0.01387405 0.01403809 0.01415563]
mean value: 0.0146104097366333
key: test_mcc
value: [0.298481 0.4758594 0.4921956 0.35885358 0.38254513 0.34558981
0.28776058 0.34672656 0.30408988 0.42720475]
mean value: 0.371930630284446
key: train_mcc
value: [0.55692239 0.47205977 0.60922644 0.60916873 0.59298979 0.59415497
0.58726012 0.48257795 0.60543142 0.51742943]
mean value: 0.5627221018468738
key: test_fscore
value: [0.52173913 0.63681592 0.65024631 0.52222222 0.58986175 0.55924171
0.44705882 0.49122807 0.5026178 0.6039604 ]
mean value: 0.552499212657852
key: train_fscore
value: [0.69936034 0.60428135 0.7098692 0.72009159 0.73313492 0.72612137
0.67341772 0.58815875 0.71995465 0.66256983]
mean value: 0.6836959721548931
key: test_precision
value: [0.54545455 0.68085106 0.6875 0.64383562 0.58181818 0.56730769
0.61290323 0.66666667 0.57831325 0.64893617]
mean value: 0.6213586415546495
key: train_precision
value: [0.72167217 0.74062969 0.83613445 0.80744544 0.70515267 0.74217907
0.86786297 0.79298246 0.79673777 0.72053463]
mean value: 0.7731331314237726
key: test_recall
value: [0.5 0.59813084 0.61682243 0.43925234 0.59813084 0.55140187
0.35185185 0.38888889 0.44444444 0.56481481]
mean value: 0.5053738317757009
key: train_recall
value: [0.67838676 0.51033058 0.61673554 0.64979339 0.76342975 0.7107438
0.55015512 0.46742503 0.65667011 0.61323681]
mean value: 0.6216906894459306
key: test_accuracy
value: [0.69444444 0.77399381 0.78018576 0.73374613 0.7244582 0.7120743
0.70897833 0.73065015 0.70588235 0.75232198]
mean value: 0.7316735466116271
key: train_accuracy
value: [0.80598555 0.77751032 0.83218707 0.83184319 0.81499312 0.82152682
0.82255846 0.78232462 0.8301238 0.79229711]
mean value: 0.8111350063807468
key: test_roc_auc
value: [0.64583333 0.72962098 0.73896677 0.65944098 0.69258394 0.67153427
0.62011197 0.64560724 0.64082687 0.70566322]
mean value: 0.6750189572315197
key: train_roc_auc
value: [0.7739872 0.71057766 0.77821313 0.7862369 0.8020757 0.79377396
0.75421203 0.70331581 0.78660399 0.7473706 ]
mean value: 0.7636366973576472
key: test_jcc
value: [0.35294118 0.46715328 0.48175182 0.35338346 0.41830065 0.38815789
0.28787879 0.3255814 0.33566434 0.43262411]
mean value: 0.3843436925305007
key: train_jcc
value: [0.53770492 0.43295355 0.55023041 0.56261181 0.57870008 0.57000829
0.50763359 0.41658986 0.56244464 0.49540518]
mean value: 0.5214282322836412
MCC on Blind test: -0.04
Accuracy on Blind test: 0.82
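The MLP run above is configured with max_iter=500, so unlike the default-limited lbfgs logistic fits it does not trip the iteration-cap warning here. A sketch of swapping it into the same preprocessing pipeline, reusing the prep transformer assumed in the earlier sketch.

from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# prep: ColumnTransformer sketched earlier (MinMaxScaler + OneHotEncoder)
mlp_pipe = Pipeline(steps=[('prep', prep),
                           ('model', MLPClassifier(max_iter=500, random_state=42))])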
Running classifier: 14
Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', MultinomialNB())])
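One reason the MinMaxScaler step matters for this particular model: MultinomialNB rejects negative feature values, and min-max scaling maps every numeric column into [0, 1]. A tiny illustration with made-up numbers:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

col = np.array([[-2.0], [0.0], [3.0]])              # raw column containing negatives
print(MinMaxScaler().fit_transform(col).ravel())    # -> [0.  0.4 1. ], safe for MultinomialNB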
key: fit_time
value: [0.02973771 0.02446604 0.0251267 0.02736926 0.02445102 0.02433276
0.02474952 0.02446532 0.02481031 0.02451968]
mean value: 0.02540283203125
key: score_time
value: [0.01322079 0.01346278 0.01353502 0.01320624 0.01318502 0.01332021
0.01321387 0.01325846 0.01320601 0.01327467]
mean value: 0.013288307189941406
key: test_mcc
value: [0.05296271 0.3001869 0.28564351 0.27519915 0.27857129 0.35585282
0.24883721 0.30386217 0.19842526 0.34924255]
mean value: 0.26487835740747306
key: train_mcc
value: [0.28079183 0.27634235 0.26740109 0.26368405 0.26819748 0.27847136
0.2840767 0.27228523 0.26733525 0.27255459]
mean value: 0.2731139929765273
key: test_fscore
value: [0.40650407 0.54935622 0.53333333 0.51184834 0.51643192 0.57657658
0.5 0.54464286 0.49166667 0.56880734]
mean value: 0.5199167327500447
key: train_fscore
value: [0.53621103 0.52740741 0.52126607 0.52127139 0.52221125 0.52578868
0.53040877 0.52290837 0.52348337 0.52064451]
mean value: 0.5251600859889782
key: test_precision
value: [0.36231884 0.50793651 0.50847458 0.51923077 0.51886792 0.55652174
0.5 0.52586207 0.4469697 0.56363636]
mean value: 0.5009818487248489
key: train_precision
value: [0.5 0.50520341 0.5 0.49489322 0.5 0.51020408
0.5120308 0.50432277 0.49675023 0.50736016]
mean value: 0.503076466396964
key: test_recall
value: [0.46296296 0.59813084 0.56074766 0.5046729 0.51401869 0.59813084
0.5 0.56481481 0.5462963 0.57407407]
mean value: 0.5423849082727588
key: train_recall
value: [0.57807653 0.55165289 0.54442149 0.55061983 0.5464876 0.54235537
0.55015512 0.54291624 0.5532575 0.53464323]
mean value: 0.5494585794012324
key: test_accuracy
value: [0.54938272 0.6749226 0.6749226 0.68111455 0.68111455 0.70897833
0.66563467 0.68421053 0.62229102 0.70897833]
mean value: 0.6651549898711921
key: train_accuracy
value: [0.66735466 0.67090784 0.66712517 0.6633425 0.66712517 0.67434663
0.67606602 0.67056396 0.6650619 0.67262724]
mean value: 0.669452109857674
key: test_roc_auc
value: [0.52777778 0.6555469 0.64611457 0.63659571 0.63895379 0.68100987
0.6244186 0.65450043 0.60338071 0.67540913]
mean value: 0.6343707487100227
key: train_roc_auc
value: [0.6449661 0.64103263 0.636386 0.63510373 0.6369036 0.64128078
0.64447478 0.63853694 0.63701 0.63800683]
mean value: 0.6393701380251593
key: test_jcc
value: [0.25510204 0.37869822 0.36363636 0.34394904 0.34810127 0.40506329
0.33333333 0.37423313 0.32596685 0.3974359 ]
mean value: 0.352551944128509
key: train_jcc
value: [0.36631717 0.35814889 0.35250836 0.35251323 0.35337341 0.35665761
0.36092266 0.35401214 0.35453943 0.3519401 ]
mean value: 0.35609329957143715
MCC on Blind test: 0.04
Accuracy on Blind test: 0.46
Running classifier: 15
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02774858 0.02745104 0.02746153 0.02744007 0.02742839 0.02742028
0.02759576 0.02740383 0.02776742 0.02736163]
mean value: 0.027507853507995606
key: score_time
value: [0.01376963 0.01378465 0.01366854 0.01373482 0.01386881 0.01383877
0.01366878 0.01372933 0.01377916 0.01388311]
mean value: 0.01377255916595459
key: test_mcc
value: [0.05155407 0.1657651 0.28145907 0.19183244 0.27856069 0.23301345
0.14830358 0.23217293 0.18524387 0.20740874]
mean value: 0.19753139266440958
key: train_mcc
value: [0.23350539 0.23313838 0.21965476 0.22351857 0.22857142 0.23447876
0.24619127 0.23193989 0.22842125 0.24420621]
mean value: 0.23236259167875425
key: test_fscore
value: [0.36018957 0.41237113 0.48958333 0.42708333 0.48421053 0.44324324
0.39583333 0.45595855 0.44117647 0.42780749]
mean value: 0.4337456983481417
key: train_fscore
value: [0.47940075 0.45412311 0.44965675 0.46136743 0.446796 0.45995423
0.45744681 0.44457688 0.45892351 0.45045045]
mean value: 0.4562695923348358
key: test_precision
value: [0.36893204 0.45977011 0.55294118 0.48235294 0.55421687 0.52564103
0.45238095 0.51764706 0.46875 0.50632911]
mean value: 0.4888961289663977
key: train_precision
value: [0.49667406 0.51856764 0.50384615 0.49939832 0.51841746 0.51538462
0.5337931 0.52461322 0.5075188 0.53724928]
mean value: 0.5155462648827601
key: test_recall
value: [0.35185185 0.37383178 0.43925234 0.38317757 0.42990654 0.38317757
0.35185185 0.40740741 0.41666667 0.37037037]
mean value: 0.39074939425406713
key: train_recall
value: [0.46328852 0.40392562 0.40599174 0.42871901 0.39256198 0.41528926
0.40020683 0.38572906 0.4188211 0.38779731]
mean value: 0.4102330416128949
key: test_accuracy
value: [0.58333333 0.64705882 0.69659443 0.65944272 0.69659443 0.68111455
0.64086687 0.6749226 0.64705882 0.66873065]
mean value: 0.6595717234262126
key: train_accuracy
value: [0.66529068 0.67675378 0.66918845 0.66678129 0.6764099 0.67537827
0.68431912 0.67950481 0.6715956 0.68535076]
mean value: 0.6750572658417358
key: test_roc_auc
value: [0.52546296 0.57811959 0.63166321 0.58973693 0.62930512 0.60594064
0.56894918 0.60835487 0.58972868 0.59448751]
mean value: 0.5921748693923221
key: train_roc_auc
value: [0.61463395 0.60840611 0.6032536 0.60714301 0.60530161 0.61022195
0.61303489 0.60579601 0.60817407 0.61069412]
mean value: 0.6086659320462695
key: test_jcc
value: [0.21965318 0.25974026 0.32413793 0.27152318 0.31944444 0.28472222
0.24675325 0.29530201 0.28301887 0.27210884]
mean value: 0.27764041870781164
key: train_jcc
value: [0.31527094 0.29376409 0.2900369 0.29985549 0.28766086 0.2986627
0.29655172 0.28582375 0.29779412 0.29069767]
mean value: 0.2956118253096111
MCC on Blind test: 0.05
Accuracy on Blind test: 0.29
Running classifier: 16
Model_name: Passive Aggressive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
[the same UserWarning was emitted 21 times in total]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
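The "Variables are collinear" messages interleaved above are raised in sklearn's discriminant_analysis module, not by the PassiveAggressiveClassifier being fitted here. If they are unwanted in the log they could be filtered, e.g. (a sketch, not part of the original script):

import warnings

# Silence only this specific message; all other warnings still surface.
warnings.filterwarnings('ignore', message='Variables are collinear')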
key: fit_time
value: [0.0460794 0.04580164 0.04810166 0.05260444 0.05596447 0.05800915
0.06486082 0.04544592 0.06838512 0.04734159]
mean value: 0.053259420394897464
key: score_time
value: [0.01168609 0.01323462 0.01321697 0.01321507 0.01370692 0.01315761
0.01221919 0.01319098 0.01156378 0.01320171]
mean value: 0.012839293479919434
key: test_mcc
value: [0.28203804 0.48516369 0.37778011 0.32130517 0.39237508 0.11195318
0.3261775 0.37409859 0.42746637 0.24483912]
mean value: 0.3343196829966986
key: train_mcc
value: [0.30266593 0.43586836 0.40034818 0.28159585 0.36458797 0.1372345
0.43635812 0.41751247 0.45041455 0.23393382]
mean value: 0.3460519759297837
key: test_fscore
value: [0.56862745 0.66371681 0.61433447 0.296875 0.62162162 0.50980392
0.48863636 0.6119403 0.63673469 0.54973822]
mean value: 0.556202885523636
key: train_fscore
value: [0.57845934 0.62226847 0.62700965 0.29840738 0.60813242 0.51548613
0.58746269 0.63610548 0.65358362 0.5443993 ]
mean value: 0.5671314472879306
key: test_precision
value: [0.43939394 0.6302521 0.48387097 0.9047619 0.48677249 0.34551495
0.63235294 0.5125 0.56934307 0.38321168]
mean value: 0.5387974035378734
key: train_precision
value: [0.44148068 0.62683438 0.51315789 0.79111111 0.46659304 0.34863388
0.69491525 0.52336449 0.55628177 0.37837838]
mean value: 0.5340750875277313
key: test_recall
value: [0.80555556 0.70093458 0.8411215 0.17757009 0.85981308 0.97196262
0.39814815 0.75925926 0.72222222 0.97222222]
mean value: 0.7208809276566286
key: train_recall
value: [0.83867632 0.6177686 0.80578512 0.1838843 0.87293388 0.98863636
0.50879007 0.81075491 0.79214064 0.97001034]
mean value: 0.7389380549881631
key: test_accuracy
value: [0.59259259 0.76470588 0.6501548 0.72136223 0.65325077 0.38080495
0.72136223 0.67801858 0.7244582 0.46749226]
mean value: 0.6354202499713335
key: train_accuracy
value: [0.59339525 0.75034388 0.68088033 0.71217331 0.62551582 0.38136176
0.76237964 0.69154058 0.72077029 0.46011004]
mean value: 0.6378470906207991
key: test_roc_auc
value: [0.64583333 0.74861544 0.69833853 0.58415542 0.70536951 0.52996279
0.64093454 0.69823428 0.72390181 0.59308786]
mean value: 0.6568433492718974
key: train_roc_auc
value: [0.65490517 0.71713172 0.71217091 0.57982875 0.68749787 0.53349344
0.69875361 0.72145164 0.73867722 0.58804484]
mean value: 0.6631955182497236
key: test_jcc
value: [0.39726027 0.49668874 0.44334975 0.17431193 0.45098039 0.34210526
0.32330827 0.44086022 0.46706587 0.37906137]
mean value: 0.39149920771443836
key: train_jcc
value: [0.40692423 0.45166163 0.45667447 0.17536946 0.4369183 0.34724238
0.4158918 0.46638905 0.48542459 0.37400319]
mean value: 0.40164991142526285
MCC on Blind test: 0.03
Accuracy on Blind test: 0.66
Running classifier: 17
Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])
key: fit_time
value: [0.06386518 0.06305885 0.06366539 0.06549168 0.06434202 0.06359529
0.06373453 0.06918001 0.06550789 0.06575584]
mean value: 0.06481966972351075
key: score_time
value: [0.01526618 0.01467252 0.0146482 0.01463866 0.01467323 0.01468325
0.01690769 0.01475978 0.01463628 0.01463294]
mean value: 0.014951872825622558
key: test_mcc
value: [0.07830353 0.04372614 0.10347461 0.13215514 0.12406471 0.1055599
0.08640593 0.09403875 0.08635497 0.10452277]
mean value: 0.0958606455656873
key: train_mcc
value: [0.14198367 0.14718414 0.14137651 0.14462715 0.14844883 0.14005845
0.1438705 0.14211294 0.14126706 0.14126706]
mean value: 0.14321962880403713
key: test_fscore
value: [0.5060241 0.49880096 0.50717703 0.51073986 0.51084337 0.50855746
0.50839329 0.50961538 0.50847458 0.51073986]
mean value: 0.5079365879679762
key: train_fscore
value: [0.51422494 0.51557923 0.51434644 0.51503059 0.51585398 0.51407329
0.51449854 0.514377 0.51395163 0.51395163]
mean value: 0.5145887274178269
key: test_precision
value: [0.34201954 0.33548387 0.34083601 0.34294872 0.34415584 0.34437086
0.34304207 0.34415584 0.3442623 0.34405145]
mean value: 0.3425326508215694
key: train_precision
value: [0.34609878 0.34732687 0.34620887 0.34682909 0.3475763 0.3459614
0.3463467 0.3463607 0.34585122 0.34585122]
mean value: 0.3464411163505491
key: test_recall
value: [0.97222222 0.97196262 0.99065421 1. 0.99065421 0.97196262
0.98148148 0.98148148 0.97222222 0.99074074]
mean value: 0.9823381793007961
key: train_recall
value: [1. 1. 1. 1. 1. 1.
1. 0.99896587 1. 1. ]
mean value: 0.9998965873836608
key: test_accuracy
value: [0.36728395 0.35294118 0.3622291 0.36532508 0.37151703 0.37770898
0.36532508 0.36842105 0.37151703 0.36532508]
mean value: 0.36675935481405036
key: train_accuracy
value: [0.37151703 0.37448418 0.37138927 0.37310867 0.37517194 0.37070151
0.37242091 0.37276479 0.37104539 0.37104539]
mean value: 0.37236490773823466
key: test_roc_auc
value: [0.51851852 0.50912946 0.52079007 0.52546296 0.52773451 0.52764798
0.51864772 0.5209733 0.52099483 0.52095177]
mean value: 0.5210851103222326
key: train_roc_auc
value: [0.52912371 0.53118557 0.52886598 0.53015464 0.53170103 0.52835052
0.5298815 0.52987964 0.52885111 0.52885111]
mean value: 0.5296844802679922
key: test_jcc
value: [0.33870968 0.33226837 0.33974359 0.34294872 0.34304207 0.34098361
0.34083601 0.34193548 0.34090909 0.34294872]
mean value: 0.3404325339063992
key: train_jcc
value: [0.34609878 0.34732687 0.34620887 0.34682909 0.3475763 0.3459614
0.3463467 0.34623656 0.34585122 0.34585122]
mean value: 0.346428701988443
MCC on Blind test: 0.03
Accuracy on Blind test: 0.08
Running classifier: 18
Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
[the same FutureWarning was emitted 21 times in total]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])
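The FutureWarning repeated above states its own remedy: pass max_features explicitly instead of relying on the deprecated 'auto'. A sketch of the equivalent, warning-free constructor call (per the warning text, 'sqrt' keeps the past behaviour for classifiers):

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=1000, max_features='sqrt', random_state=42)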
key: fit_time
value: [9.01436353 8.96407723 9.21389389 8.98771 9.14480662 9.16903424
9.03453922 9.07725644 8.95009971 9.14932513]
mean value: 9.070510601997375
key: score_time
value: [0.14619231 0.14283562 0.14461136 0.13854027 0.14236856 0.14063048
0.13956451 0.14329457 0.14937806 0.13968325]
mean value: 0.14270989894866942
key: test_mcc
value: [0.37350894 0.50965676 0.48382521 0.41595314 0.41984123 0.36466515
0.3844447 0.46542662 0.39237022 0.46112903]
mean value: 0.4270821007757539
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.55208333 0.65656566 0.63212435 0.56830601 0.59183673 0.53763441
0.53631285 0.62244898 0.56084656 0.61052632]
mean value: 0.5868685201845468
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.63095238 0.71428571 0.70930233 0.68421053 0.65168539 0.63291139
0.67605634 0.69318182 0.65432099 0.70731707]
mean value: 0.6754223949833811
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.49074074 0.60747664 0.57009346 0.48598131 0.54205607 0.46728972
0.44444444 0.56481481 0.49074074 0.53703704]
mean value: 0.520067497403946
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.7345679 0.78947368 0.78018576 0.75541796 0.75232198 0.73374613
0.74303406 0.77089783 0.74303406 0.77089783]
mean value: 0.7573577189160263
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.67361111 0.74355313 0.72717636 0.6874351 0.69926878 0.66651523
0.66873385 0.71961671 0.68025409 0.71270457]
mean value: 0.697886892543489
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.38129496 0.4887218 0.46212121 0.39694656 0.42028986 0.36764706
0.36641221 0.45185185 0.38970588 0.43939394]
mean value: 0.41643853467819475
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.02
Accuracy on Blind test: 0.66
Running classifier: 19
Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])
key: fit_time
value: [1.84367204 2.04128957 2.01582289 1.95663643 2.03340149 2.030375
1.96299672 2.03766584 1.94073915 2.00093246]
mean value: 1.9863531589508057
key: score_time
value: [0.37871456 0.34163904 0.38066006 0.32302594 0.3472302 0.38585329
0.36277223 0.33372927 0.32721639 0.37816954]
mean value: 0.35590105056762694
key: test_mcc
value: [0.39510519 0.50156793 0.46604207 0.44401062 0.42909624 0.37629696
0.36574203 0.45725434 0.35447189 0.48060795]
mean value: 0.42701952308297575
key: train_mcc
value: [0.8148669 0.80649928 0.8129697 0.80398911 0.81461086 0.8114267
0.80380552 0.80947506 0.81584791 0.79466277]
mean value: 0.8088153802650918
key: test_fscore
value: [0.54945055 0.64974619 0.61375661 0.57627119 0.58947368 0.53846154
0.51428571 0.61538462 0.51933702 0.61202186]
mean value: 0.5778188969381719
key: train_fscore
value: [0.86507937 0.85811966 0.86283438 0.85649203 0.86397268 0.86153846
0.85451409 0.86055777 0.86541738 0.84827586]
mean value: 0.8596801665597225
key: test_precision
value: [0.67567568 0.71111111 0.70731707 0.72857143 0.6746988 0.65333333
0.67164179 0.68965517 0.64383562 0.74666667]
mean value: 0.6902506663606596
key: train_precision
value: [0.95734003 0.95679797 0.96070976 0.95431472 0.96197719 0.96060991
0.96243523 0.95696203 0.95969773 0.95472186]
mean value: 0.9585566423771195
key: test_recall
value: [0.46296296 0.59813084 0.54205607 0.47663551 0.52336449 0.45794393
0.41666667 0.55555556 0.43518519 0.51851852]
mean value: 0.49870197300103836
key: train_recall
value: [0.78903826 0.77789256 0.78305785 0.7768595 0.78409091 0.78099174
0.76835574 0.78179938 0.78800414 0.76318511]
mean value: 0.7793275188663926
key: test_accuracy
value: [0.74691358 0.78637771 0.77399381 0.76780186 0.75851393 0.73993808
0.73684211 0.76780186 0.73065015 0.78018576]
mean value: 0.7589018843404809
key: train_accuracy
value: [0.91812865 0.91437414 0.91712517 0.9133425 0.91781293 0.91643741
0.91299862 0.91574966 0.91850069 0.90921596]
mean value: 0.9153685738877225
key: test_roc_auc
value: [0.67592593 0.73888024 0.71547248 0.69433628 0.69918224 0.66878678
0.65717054 0.71498708 0.65712748 0.71507321]
mean value: 0.6936942250879439
key: train_roc_auc
value: [0.88575624 0.88018339 0.88353923 0.8791514 0.8843135 0.88250618
0.87670749 0.88214132 0.88575889 0.87257658]
mean value: 0.8812634237575695
key: test_jcc
value: [0.37878788 0.48120301 0.44274809 0.4047619 0.41791045 0.36842105
0.34615385 0.44444444 0.35074627 0.44094488]
mean value: 0.4076121824209178
key: train_jcc
value: [0.76223776 0.75149701 0.75875876 0.74900398 0.76052104 0.75675676
0.74598394 0.75524476 0.76276276 0.73652695]
mean value: 0.753929370974749
MCC on Blind test: -0.01
Accuracy on Blind test: 0.66
Running classifier: 20
Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
key: fit_time
value: [0.05962729 0.04202151 0.03843474 0.0427978 0.03905845 0.04888272
0.03913784 0.03857589 0.04385161 0.04807472]
mean value: 0.0440462589263916
key: score_time
value: [0.02047539 0.02058101 0.0210855 0.03549218 0.02058411 0.02569771
0.02990842 0.03035092 0.02592182 0.02603078]
mean value: 0.025612783432006837
key: test_mcc
value: [0.42197441 0.50227121 0.49229002 0.37294957 0.43266022 0.34283746
0.38894343 0.42591183 0.38120285 0.41589376]
mean value: 0.41769347633653114
key: train_mcc
value: [0.47218937 0.4697388 0.46883024 0.4831687 0.4783516 0.48723171
0.47936313 0.47695521 0.47892274 0.46392027]
mean value: 0.4758671757780622
key: test_fscore
value: [0.59183673 0.66019417 0.65714286 0.52808989 0.59793814 0.52406417
0.55135135 0.59067358 0.55958549 0.57894737]
mean value: 0.5839823756817275
key: train_fscore
value: [0.62188767 0.61948956 0.61868833 0.63097294 0.62615741 0.63364055
0.62781304 0.62629758 0.6283084 0.61421911]
mean value: 0.6247474595699097
key: test_precision
value: [0.65909091 0.68686869 0.66990291 0.66197183 0.66666667 0.6125
0.66233766 0.67058824 0.63529412 0.67073171]
mean value: 0.659595272882945
key: train_precision
value: [0.70657895 0.70634921 0.70596026 0.71261378 0.71184211 0.71614583
0.71018277 0.70795306 0.70817121 0.70360481]
mean value: 0.7089401985493536
key: test_recall
value: [0.53703704 0.63551402 0.64485981 0.43925234 0.54205607 0.45794393
0.47222222 0.52777778 0.5 0.50925926]
mean value: 0.5265922464520596
key: train_recall
value: [0.55532575 0.55165289 0.55061983 0.5661157 0.5588843 0.56818182
0.56256463 0.56153051 0.56463289 0.54498449]
mean value: 0.5584492808122591
key: test_accuracy
value: [0.75308642 0.78328173 0.77708978 0.73993808 0.75851393 0.7244582
0.74303406 0.75541796 0.73684211 0.75232198]
mean value: 0.7523984252570423
key: train_accuracy
value: [0.7753698 0.77441541 0.77407153 0.77957359 0.7778542 0.78129298
0.77819807 0.77716644 0.7778542 0.77235213]
mean value: 0.7768148338994856
key: test_roc_auc
value: [0.69907407 0.74599775 0.7437262 0.66407061 0.70389841 0.6572127
0.67564599 0.69877261 0.67790698 0.69183893]
mean value: 0.6958144264129377
key: train_roc_auc
value: [0.72018865 0.71860995 0.71809342 0.72609909 0.72299885 0.72790534
0.7240953 0.72306304 0.72435663 0.71530523]
mean value: 0.7220715494815876
key: test_jcc
value: [0.42028986 0.49275362 0.4893617 0.35877863 0.42647059 0.35507246
0.38059701 0.41911765 0.38848921 0.40740741]
mean value: 0.41383381363708355
key: train_jcc
value: [0.4512605 0.4487395 0.44789916 0.46089151 0.45577085 0.46374368
0.45752733 0.4559194 0.45805369 0.4432296 ]
mean value: 0.45430352175828553
MCC on Blind test: -0.01
Accuracy on Blind test: 0.82
Running classifier: 21
Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=3)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=3))])
key: fit_time
value: [0.2722826 0.22084355 0.20384145 0.28226876 0.19954824 0.20400214
0.1897254 0.1892333 0.19360566 0.19148731]
mean value: 0.2146838426589966
key: score_time
value: [0.01725984 0.02047038 0.02116036 0.02048469 0.02041197 0.02045321
0.02044177 0.02048683 0.02418494 0.02039576]
mean value: 0.02057497501373291
key: test_mcc
value: [0.40929374 0.49615965 0.49229002 0.41208021 0.40300521 0.39898913
0.3758128 0.39172684 0.37918242 0.43641862]
mean value: 0.4194958643813836
key: train_mcc
value: [0.45882857 0.44994518 0.45427321 0.45943451 0.45959564 0.46399503
0.45907228 0.45639837 0.47166873 0.44948321]
mean value: 0.4582694727184512
key: test_fscore
value: [0.58585859 0.65700483 0.65714286 0.55367232 0.57731959 0.56842105
0.52808989 0.57142857 0.55497382 0.58823529]
mean value: 0.5842146805740139
key: train_fscore
value: [0.6140553 0.60602549 0.60768335 0.61494253 0.61207898 0.61565217
0.61117579 0.61066049 0.62196532 0.60361938]
mean value: 0.6117858796917831
key: test_precision
value: [0.64444444 0.68 0.66990291 0.7 0.64367816 0.65060241
0.67142857 0.63636364 0.63855422 0.69620253]
mean value: 0.6631176883929146
key: train_precision
value: [0.69310793 0.68997361 0.696 0.69300518 0.69893899 0.7014531
0.69906791 0.69433465 0.7051114 0.69302949]
mean value: 0.6964022278190525
key: test_recall
value: [0.53703704 0.63551402 0.64485981 0.45794393 0.52336449 0.5046729
0.43518519 0.51851852 0.49074074 0.50925926]
mean value: 0.5257095880927656
key: train_recall
value: [0.55118925 0.54028926 0.5392562 0.55268595 0.54442149 0.54855372
0.54291624 0.54498449 0.55635988 0.53464323]
mean value: 0.5455299682924953
key: test_accuracy
value: [0.74691358 0.78018576 0.77708978 0.75541796 0.74613003 0.74613003
0.73993808 0.73993808 0.73684211 0.76160991]
mean value: 0.7530195313993042
key: train_accuracy
value: [0.76952184 0.76616231 0.76822558 0.7696011 0.77028886 0.77200825
0.77028886 0.76891334 0.77510316 0.76650619]
mean value: 0.7696619505448359
key: test_roc_auc
value: [0.69444444 0.74368294 0.7437262 0.68036085 0.68992298 0.68520682
0.66410422 0.68484065 0.67560293 0.69881568]
mean value: 0.6960707716518952
key: train_roc_auc
value: [0.71476988 0.70957762 0.71086521 0.7152605 0.71370559 0.71602944
0.7132407 0.71272924 0.72022012 0.7083314 ]
mean value: 0.7134729702518261
key: test_jcc
value: [0.41428571 0.48920863 0.4893617 0.3828125 0.4057971 0.39705882
0.35877863 0.4 0.38405797 0.41666667]
mean value: 0.4138027738120944
key: train_jcc
value: [0.44305902 0.43474647 0.43645485 0.4439834 0.44100418 0.44472362
0.44006706 0.43953294 0.45134228 0.43227425]
mean value: 0.4407188071791433
MCC on Blind test: -0.03
Accuracy on Blind test: 0.74
Running classifier: 22
Model_name: SVC
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SVC(random_state=42))])
key: fit_time
value: [0.4675827 0.40158606 0.46370888 0.44388795 0.45260119 0.46981597
0.47891617 0.46896625 0.45464277 0.46869564]
mean value: 0.45704035758972167
key: score_time
value: [0.10840988 0.10832095 0.111202 0.11126161 0.10859156 0.11575794
0.10987806 0.10994101 0.11417532 0.10804772]
mean value: 0.1105586051940918
key: test_mcc
value: [0.34831641 0.45406128 0.43917931 0.38397507 0.33629869 0.39030805
0.34809347 0.40289652 0.31908839 0.38855718]
mean value: 0.38107743600878463
key: train_mcc
value: [0.48971449 0.47710193 0.48490935 0.48886646 0.48778682 0.49328573
0.48281574 0.48601248 0.47289443 0.47735162]
mean value: 0.48407390571654874
key: test_fscore
value: [0.51648352 0.61139896 0.60103627 0.51764706 0.50828729 0.54444444
0.49710983 0.55737705 0.48587571 0.52325581]
mean value: 0.5362915941667892
key: train_fscore
value: [0.62440191 0.60049938 0.61226994 0.62222222 0.62048193 0.62173649
0.61538462 0.61641337 0.61000603 0.59786029]
mean value: 0.6141276175290785
key: test_precision
value: [0.63513514 0.68604651 0.6744186 0.6984127 0.62162162 0.67123288
0.66153846 0.68 0.62318841 0.703125 ]
mean value: 0.6654719315496417
key: train_precision
value: [0.74042553 0.75867508 0.75377644 0.74318508 0.74421965 0.75405007
0.74269006 0.74778761 0.73121387 0.76366559]
mean value: 0.7479688988337747
key: test_recall
value: [0.43518519 0.55140187 0.54205607 0.41121495 0.42990654 0.45794393
0.39814815 0.47222222 0.39814815 0.41666667]
mean value: 0.4512893734856352
key: train_recall
value: [0.53981386 0.49690083 0.51549587 0.53512397 0.53202479 0.52892562
0.52533609 0.52430196 0.52326784 0.49120993]
mean value: 0.5212400753801055
key: test_accuracy
value: [0.72839506 0.76780186 0.76160991 0.74613003 0.7244582 0.74613003
0.73065015 0.74922601 0.71826625 0.74613003]
mean value: 0.7418797538508581
key: train_accuracy
value: [0.78396973 0.77991747 0.7826685 0.78370014 0.78335626 0.78576341
0.78163686 0.78301238 0.77751032 0.78026135]
mean value: 0.7821796413249051
key: test_roc_auc
value: [0.65509259 0.71320093 0.70621322 0.661626 0.65013846 0.67341641
0.64791128 0.68029716 0.63860896 0.66414729]
mean value: 0.6690652293784763
key: train_roc_auc
value: [0.72274198 0.70901742 0.71573762 0.72142796 0.72039384 0.72142157
0.71733059 0.71810152 0.71372047 0.70773788]
mean value: 0.7167630875580664
key: test_jcc
value: [0.34814815 0.44029851 0.42962963 0.34920635 0.34074074 0.3740458
0.33076923 0.38636364 0.32089552 0.35433071]
mean value: 0.36744282748966156
key: train_jcc
value: [0.45391304 0.42908118 0.44120248 0.4516129 0.44978166 0.45110132
0.44444444 0.44551845 0.43885516 0.42639138]
mean value: 0.4431902021612174
MCC on Blind test: -0.01
Accuracy on Blind test: 0.65
Running classifier: 23
Model_name: Stochastic Gradient Descent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', 'KOSJ950100_SST',
'volumetric_rr'],
dtype='object', length=165)),
('cat', OneHotEncoder(),
Index(['electrostatics_change', 'water_change', 'aa_prop_change',
'active_site', 'polarity_change', 'ss_class'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.07236099 0.1272366 0.13301539 0.11001801 0.10668135 0.10639429
0.08803391 0.12272954 0.10177016 0.08205581]
mean value: 0.10502960681915283
key: score_time
value: [0.01160359 0.01106191 0.01114011 0.0111897 0.01124978 0.01107454
0.01175976 0.01118541 0.01129556 0.01160216]
mean value: 0.011316251754760743
key: test_mcc
value: [0.34422369 0.38397507 0.40959626 0.38978764 0.25748688 0.29586397
0.263153 0.35606372 0.31210067 0.35919771]
mean value: 0.3371448599321918
key: train_mcc
value: [0.46513384 0.41835043 0.38179125 0.45727538 0.25162437 0.46942356
0.37811835 0.44853263 0.45746773 0.38629912]
mean value: 0.41140166429632724
key: test_fscore
value: [0.5787234 0.51764706 0.48366013 0.49689441 0.2 0.46242775
0.37086093 0.54822335 0.48314607 0.44444444]
mean value: 0.45860275386667304
key: train_fscore
value: [0.65240642 0.50141643 0.42703533 0.58947368 0.21821461 0.59714463
0.47932726 0.61391695 0.59915612 0.47909284]
mean value: 0.5157184267933566
key: test_precision
value: [0.53543307 0.6984127 0.80434783 0.74074074 0.92307692 0.60606061
0.65116279 0.60674157 0.61428571 0.75555556]
mean value: 0.6935817498816719
key: train_precision
value: [0.61559633 0.7972973 0.83233533 0.73570325 0.85815603 0.74805599
0.74347826 0.67116564 0.71820809 0.76126126]
mean value: 0.7481257477378727
key: test_recall
value: [0.62962963 0.41121495 0.34579439 0.37383178 0.11214953 0.37383178
0.25925926 0.5 0.39814815 0.31481481]
mean value: 0.37186742817583934
key: train_recall
value: [0.69389866 0.36570248 0.28719008 0.49173554 0.125 0.49690083
0.35367115 0.56566701 0.5139607 0.34953464]
mean value: 0.4243261086943516
key: test_accuracy
value: [0.69444444 0.74613003 0.75541796 0.74922601 0.70278638 0.7120743
0.70588235 0.7244582 0.71517028 0.73684211]
mean value: 0.7242432060543516
key: train_accuracy
value: [0.75404197 0.75790922 0.7434663 0.77200825 0.70185695 0.77682256
0.74449794 0.76341128 0.7713205 0.74724897]
mean value: 0.7532583920896722
key: test_roc_auc
value: [0.67824074 0.661626 0.65206386 0.65450848 0.55375995 0.6267307
0.59474591 0.66860465 0.63628338 0.63182601]
mean value: 0.6358389681792203
key: train_roc_auc
value: [0.73895964 0.65965536 0.62916205 0.7017956 0.55734536 0.70669784
0.64643887 0.71379693 0.70674851 0.64746181]
mean value: 0.6708061971633456
key: test_jcc
value: [0.40718563 0.34920635 0.31896552 0.33057851 0.11111111 0.30075188
0.22764228 0.37762238 0.31851852 0.28571429]
mean value: 0.30272964566752425
key: train_jcc
value: [0.48412698 0.33459357 0.27148438 0.41791045 0.12246964 0.42566372
0.31520737 0.44291498 0.42771084 0.31500466]
mean value: 0.35570865883434094
MCC on Blind test: -0.0
Accuracy on Blind test: 0.66
Running classifier: 24
Model_name: XGBoost
Model func: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
[the same FutureWarning was emitted 3 times in total]
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:419: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_CV['source_data'] = 'CV'
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_functions/MultClfs_logo_skf.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
scoresDF_BT['source_data'] = 'BT'
XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
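The SettingWithCopyWarning above points at plain column assignments on what pandas suspects may be a slice (scoresDF_CV / scoresDF_BT in MultClfs_logo_skf.py). The usual remedies are taking an explicit copy or assigning via .loc, sketched here on stand-in frames rather than the script's actual objects:

import pandas as pd

scoresDF_CV = pd.DataFrame({'MCC': [0.43], 'Accuracy': [0.76]})   # stand-in values
scoresDF_BT = pd.DataFrame({'MCC': [-0.01], 'Accuracy': [0.60]})  # stand-in values

scoresDF_CV = scoresDF_CV.copy()            # explicit copy breaks the chained-assignment ambiguity
scoresDF_CV['source_data'] = 'CV'

scoresDF_BT.loc[:, 'source_data'] = 'BT'    # or assign through .loc, as the warning suggests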
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['KOLA920101', 'MIYS930101', 'MUET020101', 'KESO980102', 'disulfide_ss',
'NGPC000101', 'MIYS960101', 'KANM000101', 'DOSZ010104', 'DOSZ010101',
...
'KAPO950101', 'electro_sm', 'OGAK980101', 'MOOG990101', 'snap2_score',
'SKOJ970101', 'DAYM780302', 'BENS940103', '...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.4045248 0.39586329 0.54533625 0.38850045 0.40007329 0.4009192
0.39819217 0.53889465 0.40999818 0.40365672]
mean value: 0.4285959005355835
key: score_time
value: [0.0127182 0.01287961 0.01206136 0.01228976 0.01205468 0.01234174
0.01220226 0.01323819 0.01356483 0.01230931]
mean value: 0.012565994262695312
key: test_mcc
value: [0.43237357 0.553524 0.44073036 0.42070123 0.38897269 0.41589901
0.37785757 0.42205585 0.38966166 0.50734459]
mean value: 0.4349120541372022
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.61165049 0.68686869 0.61386139 0.58201058 0.57711443 0.58333333
0.57142857 0.60952381 0.56701031 0.64583333]
mean value: 0.604863492521287
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.64285714 0.74725275 0.65263158 0.67073171 0.61702128 0.65882353
0.61052632 0.62745098 0.63953488 0.73809524]
mean value: 0.660492540037964
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.58333333 0.63551402 0.57943925 0.51401869 0.54205607 0.52336449
0.53703704 0.59259259 0.50925926 0.57407407]
mean value: 0.5590688819660782
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.75308642 0.80804954 0.75851393 0.75541796 0.73684211 0.75232198
0.73065015 0.74613003 0.73993808 0.78947368]
mean value: 0.7570423881053395
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.71064815 0.76451627 0.71333074 0.69450935 0.6876947 0.69455261
0.68247201 0.7079242 0.68253661 0.73587425]
mean value: 0.7074058880114629
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.44055944 0.52307692 0.44285714 0.41044776 0.40559441 0.41176471
0.4 0.43835616 0.39568345 0.47692308]
mean value: 0.43452630737083436
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: -0.01
Accuracy on Blind test: 0.6
Extracting tts_split_name: logo_skf_BT_gid
Total cols in each df:
CV df: 8
metaDF: 17
Adding column: Model_name
Total cols in bts df:
BT_df: 8
First proceeding to rowbind CV and BT dfs:
Final output should have: 25 columns
Combining 2 dfs using pd.concat by row ~ rowbind
Checking Dims of df to combine:
Dim of CV: (24, 8)
Dim of BT: (24, 8)
8
Number of Common columns: 8
These are: ['Precision', 'Accuracy', 'source_data', 'F1', 'Recall', 'MCC', 'ROC_AUC', 'JCC']
Concatenating dfs with different resampling methods [WF]:
Split type: logo_skf_BT_gid
No. of dfs combining: 2
PASS: 2 dfs successfully combined
nrows in combined_df_wf: 48
ncols in combined_df_wf: 8
PASS: proceeding to merge metadata with CV and BT dfs
Adding column: Model_name
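The rowbind-then-merge just logged (two 24 x 8 score frames concatenated into 48 rows, then joined to the metadata frame) follows the standard pandas pattern; a minimal sketch on stand-in frames, assuming Model_name is the merge key after the "Adding column: Model_name" step:

import pandas as pd

# Stand-in frames; the real CV and BT frames each hold 24 rows (one per classifier).
cv_df  = pd.DataFrame({'Model_name': ['XGBoost'], 'MCC': [0.43], 'source_data': ['CV']})
bt_df  = pd.DataFrame({'Model_name': ['XGBoost'], 'MCC': [-0.01], 'source_data': ['BT']})
metaDF = pd.DataFrame({'Model_name': ['XGBoost'], 'tts_split_name': ['logo_skf_BT_gid']})

combined_df_wf = pd.concat([cv_df, bt_df], axis=0, ignore_index=True)   # rowbind: 24 + 24 -> 48 in the real run
final_df = combined_df_wf.merge(metaDF, on='Model_name', how='left')    # attach metadata columns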
=========================================================
SUCCESS: Ran multiple classifiers
=======================================================