
ScriptsToStudyByDay.org


Day 1: Loading Data, Normalization, Unsupervised Analysis

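| R               | Python            | Notes                               |
|-----------------+-------------------+-------------------------------------|
| LoadData.R      | LoadData.py       |                                     |
| NormalizeData.R | NormalizedData.py | RLE- and mean-center-normalization  |
| Clustering.R    | Clustering.py     | k-means and hierarchical clustering |
| PCA_intro.R     |                   |                                     |
| PCA.R           | PCA.py            |                                     |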
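
As a rough illustration of the two normalizations named in the notes for Day 1, here is a minimal standard-library sketch. It assumes "RLE" means the relative-log-expression (median-of-ratios) size-factor scheme, which may differ from what NormalizeData.R actually does. The function names `rle_size_factors` and `mean_center` and the toy count matrix are hypothetical and are not taken from the course scripts.

```python
import math
import statistics


def rle_size_factors(counts):
    """Per-sample size factors by median of ratios to a geometric-mean reference.

    counts: list of genes, each a list of per-sample counts.
    Assumption: this is one common reading of "RLE normalization".
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if min(gene) <= 0:
            continue  # geometric mean is undefined when a count is zero
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for s, c in enumerate(gene):
            log_ratios[s].append(math.log(c) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]


def mean_center(rows):
    """Subtract each row's own mean so every row averages to zero."""
    centered = []
    for row in rows:
        mu = sum(row) / len(row)
        centered.append([v - mu for v in row])
    return centered


# Toy data (made up): sample 2 was sequenced about twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
size_factors = rle_size_factors(toy_counts)
normalized = [[c / f for c, f in zip(gene, size_factors)] for gene in toy_counts]
centered = mean_center([[math.log2(v + 1) for v in gene] for gene in normalized])
print(size_factors)
```

Dividing by the size factors removes the depth difference between samples, and mean-centering the log values afterwards puts every gene on a common baseline before clustering or PCA.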

Day 2: knn classification, overfitting, cross-validation, feature selection

| R            | Python       | Notes                                                  |
|--------------+--------------+--------------------------------------------------------|
| KnnSim.R     | KnnSim.py    | compare resub vs. test performance on simulated data   |
| KnnSimCV.R   | KnnSimCV.py  | show cross-validation (cv) removes resub bias          |
| BadFeatSel.R | BadFeatSel.py | supervised feature selection must be done under cv    |
| KnnGrid.R    | KnnGrid.py   | compare cv acc for varying k parameter on real data    |
| KnnReal.R    | KnnReal.py   | t-test feature selection/extraction + knn on real data |
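
The note that supervised feature selection must be done under cv can be illustrated with a small standard-library sketch. This is not the content of BadFeatSel.py. The data are pure noise made up for the example, and the helper names (`t_score`, `select_features`, `knn_predict`, `cv_accuracy`) are hypothetical. Selecting features with all labels before cross-validating leaks the held-out labels into the model; selecting inside each training fold does not.

```python
import random


def t_score(a, b):
    """Absolute Welch-style t statistic between two groups of values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return abs(ma - mb) / (va / len(a) + vb / len(b)) ** 0.5


def select_features(X, y, rows, n_keep):
    """Rank features by t score using only the given rows; return top indices."""
    scores = []
    for j in range(len(X[0])):
        g0 = [X[i][j] for i in rows if y[i] == 0]
        g1 = [X[i][j] for i in rows if y[i] == 1]
        scores.append((t_score(g0, g1), j))
    return [j for _, j in sorted(scores, reverse=True)[:n_keep]]


def knn_predict(X, y, train, test_row, feats, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    dist = sorted(
        (sum((X[i][j] - X[test_row][j]) ** 2 for j in feats), y[i]) for i in train
    )
    votes = sum(label for _, label in dist[:k])
    return 1 if 2 * votes > k else 0


def cv_accuracy(X, y, n_keep, select_inside, n_folds=5, seed=0):
    """k-fold cv accuracy of knn, selecting features inside or outside the folds."""
    rows = list(range(len(X)))
    random.Random(seed).shuffle(rows)
    folds = [rows[f::n_folds] for f in range(n_folds)]
    # The "bad" variant ranks features once, using every label including held-out ones.
    global_feats = None if select_inside else select_features(X, y, rows, n_keep)
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in rows if i not in held_out]
        feats = select_features(X, y, train, n_keep) if select_inside else global_feats
        correct += sum(knn_predict(X, y, train, i, feats) == y[i] for i in fold)
    return correct / len(X)


# Pure noise: 40 samples, 2000 features, labels unrelated to the data.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(40)]
y = [i % 2 for i in range(40)]
biased = cv_accuracy(X, y, n_keep=20, select_inside=False)
honest = cv_accuracy(X, y, n_keep=20, select_inside=True)
print("selection outside cv:", biased)
print("selection inside cv: ", honest)
```

Because the labels carry no signal, an unbiased estimate should sit near chance. The outside-cv variant typically reports a much higher accuracy, which is the optimism the Day 2 scripts are meant to expose.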

Day 3: linear models, regularization, naive bayes

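| R                         | Python                      | Notes |
|---------------------------+-----------------------------+-------|
| TTesting.R                | TTesting.py                 |       |
| PredictingGeneExpression.R | PredictionGeneExpression.py |      |
| WhyRegularize.R           | WhyRegularize.py            |       |
| LogisticReal.R            | LogisticReal.py             |       |
| LdaIsLikeLogistic.R       |                             |       |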

Day 4: svm, bootstrap, trees, random forests, boosting

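| R                    | Python                | Notes                                         |
|----------------------+-----------------------+-----------------------------------------------|
| SvmReal.R            | SvmReal.py            |                                               |
| bootstrap_examples.R |                       | mostly taken from package bootstrap examples  |
| KnnSimBoot.R         |                       |                                               |
| RandomForestReal.R   | RandomForestReal.py   |                                               |
| AdaBoostReal.R       | AdaBoostReal.py       |                                               |
| CompareModelStrats.R | CompareModelStrats.py |                                               |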