
Update paper, example and AE
tgsmith61591 committed Jun 22, 2017
1 parent 5c15e37 commit b1338ff
Showing 3 changed files with 52 additions and 51 deletions.
2 changes: 1 addition & 1 deletion doc/smrt.tex
@@ -53,7 +53,7 @@ \section{Introduction}

A dataset may be considered imbalanced if its classification labels are disproportionately represented across classes. While class imbalance is very common and manifests itself in varying degrees, of particular interest are the cases in which one class---the majority class---is significantly more present than one or more minority classes, which are represented at a much smaller ratio. This can detrimentally impact a learning algorithm's ability to estimate a generalizable decision boundary. Consider, for instance, a medical test to determine whether a patient suffers from a rare disease. The dataset may be 99.8\% composed of negative observations with only 0.2\% positive examples. Even the most na\"ive classifier can achieve 99.8\% classification accuracy in this case by simply learning to always predict the negative class \citep{lewis1994heterogeneous}. However, such a test has no utility, since it will never accurately predict the condition of interest. This is by no means an isolated case, either; many machine learning domains---such as fraud detection, network security, and spam filtering---frequently face some level of class imbalance. This paper examines some of the pitfalls of training machine learning models on such datasets and presents a remedying class-balancing technique.

- Throughout this paper, we focus on inducing classification algorithms on a given training set, $X \in \mathbb{R}^{m \times n}$, with a corresponding set of class labels, $y \in \{0, 1, ..., c\}$ in which one or more of the minority class labels is/are represented at a significantly smaller proportion than that of one or more majority class labels. As noted by countless studies and the aforementioned medical test example, classifier efficacy often cannot meaningfully be expressed or assessed via conventional, cost-insensitive metrics such as accuracy (or the percentage of testing observations properly identified by the learner). This greatly complicates the classification task since such metrics will offer misleadingly optimistic scores on an otherwise ineffective classifier. Therefore, what makes class imbalance a particularly interesting and relevant problem is the frequent tangible cost with which misclassification of rare events is typically associated. The real-world impact of such errors can be especially perilous in the medical domain, where diagnostic datasets are especially susceptible to class disparity, as high risk examples (e.g., instances of rare diseases) tend to constitute the minority class \citep{rahman2013addressing}.
+ Throughout this paper, we focus on inducing classification algorithms on a given training set, $X \in \mathbb{R}^{m \times n}$, with a corresponding set of class labels, $y \in \{0, 1, ..., c\}$ in which one or more of the minority class labels is/are represented at a significantly smaller proportion than that of one or more majority class labels. As noted by countless studies and the aforementioned medical test example, classifier efficacy often cannot meaningfully be measured via conventional, cost-insensitive metrics such as accuracy (or the percentage of testing observations properly identified by the learner). This greatly complicates the classification task since such metrics will offer misleadingly optimistic scores on an otherwise ineffective classifier. Therefore, what makes class imbalance a particularly interesting and relevant problem is the frequent tangible cost with which misclassification of rare events is typically associated. The real-world impact of such errors can be especially perilous in the medical domain, where diagnostic datasets are especially susceptible to class disparity, as high risk examples of interest (e.g., instances of rare diseases) tend to constitute the minority class \citep{rahman2013addressing}.

Section 2 presents previous work to which our approach may be compared. Section 3 introduces generative models, and more specifically, variational auto-encoders. Section 4 outlines the details of our technique. Section 5 details the specifics of our experiments and the performance of our technique compared with other common class imbalance solutions. \\
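
The introduction's figure above is easy to check: a constant majority-class predictor reaches 99.8% accuracy on a 99.8/0.2 split. Below is a minimal Python sketch (an editorial illustration, not part of this commit; the sample counts are chosen to match the quoted ratio).

    import numpy as np
    from sklearn.metrics import accuracy_score

    # 4,990 negative observations and 10 positives: a 99.8% / 0.2% split.
    y_true = np.array([0] * 4990 + [1] * 10)

    # The "most naive" classifier: always predict the majority (negative) class.
    y_pred = np.zeros_like(y_true)

    print(accuracy_score(y_true, y_pred))  # 0.998, despite never detecting a positive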

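The revised paragraph argues that cost-insensitive metrics mislead on imbalanced data. A short scikit-learn sketch makes this concrete (again illustrative, not from the commit): a majority-class baseline scores near-perfect accuracy while minority-class recall and F1 are zero.

    from sklearn.datasets import make_classification
    from sklearn.dummy import DummyClassifier
    from sklearn.metrics import accuracy_score, f1_score, recall_score

    # An imbalanced two-class problem mirroring the paper's setup.
    X, y = make_classification(n_samples=5000, n_features=20,
                               weights=[0.998, 0.002], random_state=42)

    # Cost-insensitive baseline: always predict the most frequent class.
    baseline = DummyClassifier(strategy='most_frequent').fit(X, y)
    y_pred = baseline.predict(X)

    print(accuracy_score(y, y_pred))  # ~0.998: misleadingly optimistic
    print(recall_score(y, y_pred))    # 0.0: no minority example is ever recovered
    print(f1_score(y, y_pred))        # 0.0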
97 changes: 49 additions & 48 deletions examples/MNIST example.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions smrt/autoencode/autoencoder.py
@@ -137,7 +137,7 @@ class AutoEncoder(BaseAutoEncoder):
learning_function : str, optional (default='rms_prop')
The optimizing function for training. Default is ``'rms_prop'``, which will use
- the ``tf.train.RMSPropOptimizer``. Can be one of { ``'adadelta'``, ``'adagrad'``,
+ the ``tf.train.RMSPropOptimizer``. Can be one of {``'adadelta'``, ``'adagrad'``,
``'adagrad-da'``, ``'adam'``, ``'momentum'``, ``'proximal-sgd'``, ``'proximal-adagrad'``,
``'rms_prop'``, ``'sgd'``}
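
The strings above name TensorFlow 1.x optimizers. One plausible way such a lookup could be implemented is sketched below; only the 'rms_prop' / tf.train.RMSPropOptimizer pairing is confirmed by the docstring, and the rest of the mapping is assumed from the tf.train class names rather than taken from smrt's source.

    import tensorflow as tf  # TensorFlow 1.x, contemporary with this 2017 commit

    # Assumed string-to-class lookup; only 'rms_prop' is confirmed above.
    LEARNING_FUNCTIONS = {
        'adadelta': tf.train.AdadeltaOptimizer,
        'adagrad': tf.train.AdagradOptimizer,
        'adagrad-da': tf.train.AdagradDAOptimizer,
        'adam': tf.train.AdamOptimizer,
        'momentum': tf.train.MomentumOptimizer,
        'proximal-sgd': tf.train.ProximalGradientDescentOptimizer,
        'proximal-adagrad': tf.train.ProximalAdagradOptimizer,
        'rms_prop': tf.train.RMSPropOptimizer,
        'sgd': tf.train.GradientDescentOptimizer,
    }

    def get_optimizer_class(name):
        """Resolve a ``learning_function`` string to a tf.train optimizer class."""
        try:
            return LEARNING_FUNCTIONS[name]
        except KeyError:
            raise ValueError('unknown learning_function: %r' % name)

    # Some classes need extra constructor arguments (e.g., MomentumOptimizer
    # requires `momentum`; AdagradDAOptimizer requires `global_step`), so the
    # class is returned here rather than instantiated.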
@@ -334,7 +334,7 @@ class VariationalAutoEncoder(BaseAutoEncoder):
learning_function : str, optional (default='rms_prop')
The optimizing function for training. Default is ``'rms_prop'``, which will use
- the ``tf.train.RMSPropOptimizer``. Can be one of { ``'adadelta'``, ``'adagrad'``,
+ the ``tf.train.RMSPropOptimizer``. Can be one of {``'adadelta'``, ``'adagrad'``,
``'adagrad-da'``, ``'adam'``, ``'momentum'``, ``'proximal-sgd'``, ``'proximal-adagrad'``,
``'rms_prop'``, ``'sgd'``}
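For context, a hypothetical construction of the class this hunk documents. Only the learning_function argument and its accepted values are confirmed by the docstring; the import path and the scikit-learn-style fit call are assumptions for illustration.

    import numpy as np
    from smrt.autoencode import VariationalAutoEncoder  # assumed import path

    # Toy minority-class matrix; in practice, the minority rows of the training set.
    X_minority = np.random.rand(10, 20).astype(np.float32)

    # Only `learning_function` and its accepted strings are documented above;
    # the `fit` interface is assumed to follow scikit-learn conventions.
    vae = VariationalAutoEncoder(learning_function='adam')
    vae.fit(X_minority)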
