
Commit

accomodate feedback
AlbertDominguez committed Aug 20, 2024
1 parent 362b625 commit 3890cff
Showing 3 changed files with 14 additions and 22 deletions.
6 changes: 1 addition & 5 deletions README.md
@@ -28,9 +28,5 @@ Please run the setup script to create the environment for this exercise.
source setup.sh
```

When you are ready to start the exercise run `jupyter lab`.
```bash
jupyter lab
```
You can now open the `exercise.ipynb` file in VSCode. Please make sure that the `Python` and `Jupyter` VSCode extensions are installed before proceeding with the exercise. When you are ready, simply follow the instructions in the notebook from the beginning.

...and continue with the instructions in the notebook.
2 changes: 1 addition & 1 deletion setup.sh
@@ -1,7 +1,7 @@
#!/usr/bin/env bash

# Create environment
conda create -y -n 01_intro_dl python=3.9
conda create -y -n 01_intro_dl python=3.11

# Activate environment
conda activate 01_intro_dl
28 changes: 12 additions & 16 deletions solution.py
@@ -15,8 +15,8 @@
In particular, we will:
- Implement a perceptron and a 2-layer perceptron to compute the XOR function using NumPy.
- Introduce PyTorch, a popular framework for deep learning.
- Implement and train a simple neural network (a multi-layer perceptron) to classify points in a 2D plane using PyTorch.
- Implement and train a simple deep convolutional neural network to classify hand-written digits from the MNIST dataset using PyTorch.
- Implement and train a simple neural network (a multi-layer perceptron, or simply MLP) to classify points in a 2D plane using PyTorch.
- Implement and train a simple convolutional neural network to classify hand-written digits from the MNIST dataset using PyTorch.
- Discuss important topics in ML/DL, such as data splitting, under/overfitting and model generalization.
<div class="alert alert-block alert-danger">
@@ -56,7 +56,7 @@
* `x`: the input of the perceptron, a `numpy` array of shape `(n,)`
* `w`: the weights of the perceptron, a `numpy` array of shape `(n,)`
* `b`: a single scalar value for the bias
* `f`: a nonlinear function $f: \mathbb{R}\mapsto\mathbb{R}$
* `f`: a nonlinear function $f: \mathbb{R}\mapsto\left\{0, 1\right\}$
Test your perceptron function on 2D inputs (i.e., `n=2`) and plot the result. Change the weights, bias, and the function $f$ and see how the output of the perceptron changes.
"""
@@ -70,6 +70,10 @@ def non_linearity(a):

# %% tags=["solution"]
def non_linearity(a):
"""This non-linearity is called the step function.
NOTE: this function is not differentiable, and thus
is not cannot be used in gradient descent.
"""
return a > 0


@@ -116,9 +120,7 @@ def plot_perceptron(w, b, f):
<div class="alert alert-block alert-success">
<h2> Checkpoint 1 </h2>
You have implemented a perceptron using basic Python and NumPy functions, as well as checked what the perceptron decision boundary looks like.
We will now go over different ways to implement the perceptron together and discuss their efficiency. If you arrived here earlier, feel free to play around with the parameters of the perceptron (the weights and bias) as well as the activation function `f`.
Time: 20 working, + 10 discussion
We will now go over different ways to implement the perceptron together and discuss their efficiency. If you arrived here earlier, feel free to play around with the parameters of the perceptron (the weights and bias) as well as the activation function <code>f</code>.
</div>
"""

@@ -177,10 +179,9 @@ def plot_xor_data():
#### Hint
A single layer in a multilayer perceptron can be described by the equation $y = f(x^\intercal w + b)$ with $f$ a nonlinear function. $b$ is the so called bias (a constant offset vector) and $w$ a vector of weights. Since we are only interested in outputs of `0` or `1`, a good choice for $f$ is the threshold function. Think about which kind of logical operations you can implement with a single perceptron, then see how you can combine them to create an XOR. It might help to write down the equation for a two layer perceptron network.
A single layer in a multilayer perceptron can be described by the equation $y = f(x^\intercal w + b)$, where $f$ denotes a non-linear function, $b$ denotes the bias (a constant offset vector) and $w$ denotes a vector of weights. Since we are only interested in Boolean outputs ($\left\{0,1\right\}$), a good choice for $f$ is the threshold function. Think about which kind of logical operations you can implement with a single perceptron, then see how you can combine them to create an XOR. It might help to write down the equation for a two-layer perceptron network.
"""


# %% tags=["task"]
def xor(x):
"""
@@ -254,8 +255,6 @@ def test_xor():
<br/>
If you arrive here early, think about how to generalize the XOR function to an arbitrary number of inputs. For more than two inputs, the XOR returns True if the number of 1s in the inputs is odd, and False otherwise.
Time: 30 working + 15 min discussion
</div>
"""
# %% [markdown]
@@ -580,7 +579,6 @@ def forward(self, x):
# Update the progress bar to display the training loss
pbar.set_postfix({"training loss": curr_loss})

good_model.eval()
good_predictions = predict(good_model, X_test, y_test, batch_size, device)
good_accuracy = accuracy(good_predictions, y_test)

@@ -677,8 +675,6 @@ def plot_classifiers(classifier_1, classifier_2):
<h2> Checkpoint 3</h2>
You have now been introduced to PyTorch and trained a simple neural network on a binary classification problem. You have also seen how to visualize the decision function of the model, and what happens if the model is applied to a domain it had not seen during training.
Let us know in the exercise channel when you got here and what accuracy your model achieved! We will compare different solutions and discuss why some of them are better than others. We will also discuss the generalization behaviour of the classifier outside of the domain it was trained on.
Time: 60 working + 15 discussion
</div>
"""

@@ -700,7 +696,7 @@ def plot_classifiers(classifier_1, classifier_2):
However, the output of our network will be a 10-dimensional vector, indicating the probabilities for the input to be one of ten classes (corresponding to the digits 0 to 9). For that, we will use fully connected layers at the end of our network, once the dimensionality of a feature map is small enough to capture high-level information.
In principle, we could just use convolutional layers to reduce the size of each feature map by 2 until one feature map is small enough to allow using a fully connected layer. However, it is good practice to have a convolutional layer followed by a so-called downsampling layer, which effectively reduces the size of the feature map by the downsampling factor.
In principle, we could just use convolutional layers to reduce the size of each feature map by 2 until one feature map is small enough to allow using a fully connected layer. However, in many network architectures, you will find a convolutional layer followed by a so-called downsampling layer, which effectively reduces the size of the feature map by the downsampling factor. Whether a downsampling layer is beneficial depends mostly on the specific problem at hand.
"""


@@ -715,10 +711,10 @@ def plot_classifiers(classifier_1, classifier_2):
from torchvision import transforms

all_train_ds = MNIST(
root=".mnist", train=True, download=True, transform=transforms.ToTensor()
root="mnist_data", train=True, download=True, transform=transforms.ToTensor()
)
test_ds = MNIST(
root=".mnist", train=False, download=True, transform=transforms.ToTensor()
root="mnist_data", train=False, download=True, transform=transforms.ToTensor()
)
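Since data splitting is one of the topics of this exercise, one way to hold out a validation set from `all_train_ds` and wrap the datasets in data loaders could look as follows (a sketch; the 55,000/5,000 split, batch size, and variable names are assumptions, not necessarily what the notebook uses):

```python
import torch
from torch.utils.data import DataLoader, random_split

# reproducible 55,000/5,000 train/validation split of the 60,000 MNIST training images
train_ds, val_ds = random_split(
    all_train_ds, [55_000, 5_000], generator=torch.Generator().manual_seed(42)
)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64, shuffle=False)
test_loader = DataLoader(test_ds, batch_size=64, shuffle=False)
```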

# %% [markdown]
