ML_AI_training

Author	SHA1	Message	Date
Tanushree Tunstall	5fe2dc47cd	added files and saving work	2022-05-05 19:44:19 +01:00
Tanushree Tunstall	409caaf0bc	added lineage and af count accounting for corrupt data	2022-04-08 17:00:57 +01:00
Tanushree Tunstall	28d0d68413	added distinct lin count for each mutation	2022-04-07 18:47:53 +01:00
Tanushree Tunstall	67d9e6160a	added logoplot_example.R and ga_customers.csv	2022-04-05 14:52:08 +01:00
Tanushree Tunstall	c647773520	saving work	2022-04-05 14:51:21 +01:00
Tanushree Tunstall	6a9d23ec8f	added sample test data for processing to get correct annotations	2022-03-24 17:42:02 +00:00
Tanushree Tunstall	005efb1e0e	updated NOTES to reflect importance of eg 5 in unsup_v1.py	2022-03-23 16:25:27 +00:00
Tanushree Tunstall	89a0c3a58a	added tutorial examples and my data workthrough examples in unsup_v1.py	2022-03-23 16:23:18 +00:00
Tanushree Tunstall	ad5ebad7f8	renamed hyperparams to gscv	2022-03-22 11:08:20 +00:00
Tanushree Tunstall	a82358dbb4	renamed practice_cv2 to cross_validate_vs_loopity_loop	2022-03-22 11:03:51 +00:00
Tanushree Tunstall	0c4f1e1e5f	added all classification algorithms params for gridsearch	2022-03-21 13:51:20 +00:00
Tanushree Tunstall	d012542435	added NOTES to indicate which scripts are important	2022-03-18 17:56:26 +00:00
Tanushree Tunstall	ffd3ce6ee3	added intra_model_gscv.py that tells me within each model which hyperparams are best, allowing me to choose the models with the best hyperparams to then compare 'INTER' model	2022-03-18 17:52:06 +00:00
Tanushree Tunstall	d3b6fe13a6	added grid_search_vs_base_estimator.py to compare results from base estimator and manual gridsearch	2022-03-18 17:51:38 +00:00
Tanushree Tunstall	b27bfa4a96	added names and links for classification algo	2022-03-18 17:50:49 +00:00
Tanushree Tunstall	824c2f041c	saving work	2022-03-18 17:50:24 +00:00
Tanushree Tunstall	ab1508e9fb	added testing_lazypredict that runs 30 ML models in one go	2022-03-17 18:20:50 +00:00
Tanushree Tunstall	de05652ef6	added scripts for playing base_estimator	2022-03-17 18:20:19 +00:00
Tanushree Tunstall	5138036d8b	playing with gridsearchCV and base estimator	2022-03-17 18:19:43 +00:00
Tanushree Tunstall	458a933d73	added proof of concept checks to make sure loopity loop is equivalent to cross_validate with stratified Kfold passed as a cv param	2022-03-17 18:18:43 +00:00
Tanushree Tunstall	d0c329a1d9	modified loopity and multclass3 to have skf_cv as a parameter for cv	2022-03-17 18:17:58 +00:00
Tanushree Tunstall	97620c1bb0	added practice and base_estimator for all the confusion in my head	2022-03-16 10:12:59 +00:00
Tanushree Tunstall	e28a296d98	saving work	2022-03-16 10:11:13 +00:00
Tanushree Tunstall	a1631ea54b	added loopity loop function call to extract mean values for each model's metric from nested dict (2/3 levels)	2022-03-14 18:46:59 +00:00
Tanushree Tunstall	29306e77ee	added examples and practice run for imbalanced data sets	2022-03-14 18:43:29 +00:00
Tanushree Tunstall	1016430ae0	added copy to imports	2022-03-14 18:43:02 +00:00
Tanushree Tunstall	160053d361	loopity_loop_CALL	2022-03-14 18:36:23 +00:00
Tanushree Tunstall	7aead2d4f4	added loopity_loop to run multiple models with stratified k-fold, got stuck in infinite loops and nested dicts	2022-03-14 10:36:19 +00:00
Tanushree Tunstall	69d0c1b557	dict	2022-03-10 19:20:02 +00:00
Tanushree Tunstall	d733b980ba	added MultClassPipe3.py that runs multiple classification models on stratified K-fold data	2022-03-09 18:36:47 +00:00
Tanushree Tunstall	1bfb35c30c	trying Stratified Kfold split on running multiple pipelines	2022-03-09 18:35:54 +00:00
Tanushree Tunstall	bb8f6f70ba	added prelim run for pnca all models with one-hot encoder multi model pipeline	2022-03-07 18:27:58 +00:00
Tanushree Tunstall	dd8fd5b8ac	added MultClassPipe2.py that has one hot encoder included	2022-03-07 18:27:29 +00:00
Tanushree Tunstall	b637ebc6d2	saving work	2022-03-07 18:27:07 +00:00
Tanushree Tunstall	564e72fc2d	added MultClassPipe2 that has one hot encoder step to the pipeline	2022-03-07 17:36:48 +00:00
Tanushree Tunstall	f5dcf29e25	added my_data9.py trying models with num and cat features	2022-03-07 15:32:34 +00:00
Tanushree Tunstall	3bf63c522c	trying one_hot encoder for categ vars, which was successful but not rfecv	2022-03-06 14:49:51 +00:00
Tanushree Tunstall	6160d943f5	made var names more meaningful	2022-03-06 14:49:32 +00:00
Tanushree Tunstall	e2b997badf	trying feature selection for classification logistic algorithm on 3 types of target	2022-03-05 15:13:43 +00:00
Tanushree Tunstall	ec2d5ca25b	saving work	2022-03-05 15:13:26 +00:00
Tanushree Tunstall	877862acb7	added count for targets for all genes and ran multiple classification models for all of the genes and target as a start	2022-03-04 19:16:04 +00:00
Tanushree Tunstall	89158bc669	saving work	2022-03-04 19:15:49 +00:00
Tanushree Tunstall	51069fdb76	output merged_df3 and merged_df2 files for all gene-targets along with active site residues annotated	2022-03-04 10:58:14 +00:00
Tanushree Tunstall	bff16fc219	added my_data5.py to run multiple classifications algorithms and added prelim results	2022-03-03 17:59:51 +00:00
Tanushree Tunstall	1fecbc15c9	added standard KFold as well	2022-03-03 15:18:34 +00:00
Tanushree Tunstall	04e0267dd1	added my_data4 after outputting merged_df3 for pnca to test the ml models	2022-03-03 13:35:05 +00:00
Tanushree Tunstall	25a55ac914	added practice scripts 2 and 3 to test different methods	2022-03-02 19:42:51 +00:00
Tanushree Tunstall	9d46613ca4	updated practice script with some notes	2022-02-24 18:41:15 +00:00
Tanushree Tunstall	67e003df8b	added my data ML test	2022-02-24 18:34:07 +00:00
Tanushree Tunstall	8edd4c5b6d	added practicals and solutions for TF	2020-01-28 08:49:52 +00:00

52 commits