Commit graph

52 commits

SHA1 Message Date
5fe2dc47cd added files and saving work 2022-05-05 19:44:19 +01:00
409caaf0bc added lineage and af count accounting for corrupt data 2022-04-08 17:00:57 +01:00
28d0d68413 added distinct lin count for each mutation 2022-04-07 18:47:53 +01:00
67d9e6160a added logoplot_example.R and ga_customers.csv 2022-04-05 14:52:08 +01:00
c647773520 saving work 2022-04-05 14:51:21 +01:00
6a9d23ec8f added sample test data for processing to get correct annotations 2022-03-24 17:42:02 +00:00
005efb1e0e updated NOTES to reflect importance of eg 5 in unsup_v1.py 2022-03-23 16:25:27 +00:00
89a0c3a58a added tutorial examples and my data workthrough examples in unsup_v1.py 2022-03-23 16:23:18 +00:00
ad5ebad7f8 renamed hyperparams to gscv 2022-03-22 11:08:20 +00:00
a82358dbb4 renamed practice_cv2 to cross_validate_vs_loopity_loop 2022-03-22 11:03:51 +00:00
0c4f1e1e5f added all classification algorithms params for gridsearch 2022-03-21 13:51:20 +00:00
d012542435 added NOTES to indicate which scripts are important 2022-03-18 17:56:26 +00:00
ffd3ce6ee3 added intra_model_gscv.py that tells me, within each model, which hyperparams are best; allows me to choose the models with the best hyperparams to then compare 'INTER' model 2022-03-18 17:52:06 +00:00
d3b6fe13a6 added grid_search_vs_base_estimator.py to compare results from baseestimator and gridsearch manual 2022-03-18 17:51:38 +00:00
b27bfa4a96 added names and links for classification algo 2022-03-18 17:50:49 +00:00
824c2f041c saving work 2022-03-18 17:50:24 +00:00
ab1508e9fb added testing_lazypredict that runs 30 ML models in one go 2022-03-17 18:20:50 +00:00
de05652ef6 added scripts for playing base_estimator 2022-03-17 18:20:19 +00:00
5138036d8b playing with gridsearchCV and base estimator 2022-03-17 18:19:43 +00:00
458a933d73 added proof of concept checks to make sure loopity loop is equivalent to cross_validate with stratified Kfold passed as a cv param 2022-03-17 18:18:43 +00:00
d0c329a1d9 modified loopity and multclass3 to have skf_cv as a parameter for cv 2022-03-17 18:17:58 +00:00
97620c1bb0 added practice and base_estimator for all the confusion in my head 2022-03-16 10:12:59 +00:00
e28a296d98 saving work 2022-03-16 10:11:13 +00:00
a1631ea54b added loopity loop function call to extract mean values for each model's metric from nested dict (2/3 levels) 2022-03-14 18:46:59 +00:00
29306e77ee added examples and practice run for imbalanced data sets 2022-03-14 18:43:29 +00:00
1016430ae0 added copy to imports 2022-03-14 18:43:02 +00:00
160053d361 loopity_loop_CALL 2022-03-14 18:36:23 +00:00
7aead2d4f4 added loopity_loop to run multiple models with stratified k-fold, got stuck in infinite loops and nested dicts 2022-03-14 10:36:19 +00:00
69d0c1b557 dict 2022-03-10 19:20:02 +00:00
d733b980ba added MultClassPipe3.py that runs multiple classification models on stratified K-fold data 2022-03-09 18:36:47 +00:00
1bfb35c30c trying Stratified Kfold split on running multiple pipelines 2022-03-09 18:35:54 +00:00
bb8f6f70ba added prelim run for pnca all models with one-hot encoder multi model pipeline 2022-03-07 18:27:58 +00:00
dd8fd5b8ac added MultClassPipe2.py that has one hot encoder included 2022-03-07 18:27:29 +00:00
b637ebc6d2 saving work 2022-03-07 18:27:07 +00:00
564e72fc2d added MultClassPipe2 that has one hot encoder step to the pipeline 2022-03-07 17:36:48 +00:00
f5dcf29e25 added my_data9.py trying models with num and cat features 2022-03-07 15:32:34 +00:00
3bf63c522c trying one_hot encoder for categ vars, which was successful but not rfecv 2022-03-06 14:49:51 +00:00
6160d943f5 made var names more meaningful 2022-03-06 14:49:32 +00:00
e2b997badf trying feature selection for classification logistic algorithm on 3 types of target 2022-03-05 15:13:43 +00:00
ec2d5ca25b saving work 2022-03-05 15:13:26 +00:00
877862acb7 added count for targets for all genes and ran multiple classification models for all of the genes and target as a start 2022-03-04 19:16:04 +00:00
89158bc669 saving work 2022-03-04 19:15:49 +00:00
51069fdb76 output merged_df3 and merged_df2 files for all gene-targets along with active site residues annotated 2022-03-04 10:58:14 +00:00
bff16fc219 added my_data5.py to run multiple classifications algorithms and added prelim results 2022-03-03 17:59:51 +00:00
1fecbc15c9 added standard KFold as well 2022-03-03 15:18:34 +00:00
04e0267dd1 added my_data4 after outputting merged_df3 for pnca to test the ml models 2022-03-03 13:35:05 +00:00
25a55ac914 added practice scripts 2 and 3 to test different methods 2022-03-02 19:42:51 +00:00
9d46613ca4 updated practice script with some notes 2022-02-24 18:41:15 +00:00
67e003df8b added my data ML test 2022-02-24 18:34:07 +00:00
8edd4c5b6d added practicals and solutions for TF 2020-01-28 08:49:52 +00:00