Introduction to Deep Learning in PyTorch

Gustavo Albuquerque Lima
6 min read · Nov 5, 2021

If you are getting started with data science and deep learning, then this article is for you.

Prerequisites

You just need to know the following:

Summary:

1- Installation
2- NumPy basics
3- PyTorch basics
4- First model
5- Conclusions
6- What should I learn now?
7- References

1. Installation

# Linux / Binder
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Windows
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# MacOS
# !pip install numpy torch torchvision torchaudio

2. NumPy basics

W3schools introduction to NumPy:

NumPy is a Python library used for working with arrays.

It also has functions for working in domain of linear algebra, fourier transform, and matrices.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

Below are some basics
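As a rough illustration of those basics, here is a minimal sketch. The array values and variable names are my own picks for the example, not something prescribed by NumPy.

```python
import numpy as np

# Create a 1-D array (vector) and a 2-D array (matrix)
vector = np.array([1.0, 2.0, 3.0])
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

print(vector.shape)  # (3,)
print(matrix.shape)  # (2, 3)
print(matrix.ndim)   # 2
print(matrix.dtype)  # float64

# Element-wise arithmetic
doubled = vector * 2      # [2. 4. 6.]
summed = vector + vector  # [2. 4. 6.]

# Matrix-vector product: (2, 3) @ (3,) -> (2,)
product = matrix @ vector  # [14. 32.]

# Reduction over all elements
mean_value = matrix.mean()  # 3.5
```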

For more info

https://numpy.org/doc/stable/user/quickstart.html
https://www.w3schools.com/python/numpy/numpy_intro.asp

3. PyTorch basics

PyTorch is a Python library for processing tensors. A tensor can be a number, vector, matrix, or any n-dimensional array.

A tensor has several properties, such as its size, its number of dimensions, and its data type (dtype).

Working with dimensions
Working with dimensions, size and the dtype
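A small sketch of how these properties can be inspected. The tensors below are arbitrary examples chosen for illustration.

```python
import torch

scalar = torch.tensor(4.0)                     # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])         # 1 dimension
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])  # 2 dimensions, integer values

print(scalar.ndim, scalar.shape)  # 0 torch.Size([])
print(vector.ndim, vector.shape)  # 1 torch.Size([3])
print(matrix.ndim, matrix.shape)  # 2 torch.Size([2, 3])

# The dtype is inferred from the values used to build the tensor
print(vector.dtype)  # torch.float32
print(matrix.dtype)  # torch.int64

# Change the shape or the dtype; both calls return new tensors
reshaped = matrix.reshape(3, 2)
as_float = matrix.to(torch.float32)
```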

Now we are going to do some basic operations: addition, subtraction, division, and multiplication.

Basic operations. Methods whose names end in an underscore, such as tensor.add_(), apply the operation in place, modifying the same tensor variable.
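To make the difference between regular and in-place operations concrete, here is a short sketch with two example tensors (the values are arbitrary).

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

# Element-wise arithmetic; each line returns a new tensor
added = a + b        # tensor([5., 7., 9.])
subtracted = b - a   # tensor([3., 3., 3.])
multiplied = a * b   # tensor([ 4., 10., 18.])
divided = b / a      # tensor([4.0000, 2.5000, 2.0000])

# Out-of-place: returns a new tensor, `a` is unchanged
c = a.add(b)
print(a)  # tensor([1., 2., 3.])

# In-place: the trailing underscore modifies `a` itself
a.add_(b)
print(a)  # tensor([5., 7., 9.])
```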

And what if we want to calculate the average?
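One possible way to do it, sketched with example values. Note that mean() needs a floating-point tensor.

```python
import torch

values = torch.tensor([2.0, 4.0, 6.0, 8.0])

# Built-in mean
average = values.mean()  # tensor(5.)

# The same thing by hand: sum divided by the number of elements
manual = values.sum() / values.numel()  # tensor(5.)

# For a matrix, `dim` selects the dimension that gets averaged away
matrix = torch.tensor([[1.0, 2.0],
                       [3.0, 4.0]])
column_means = matrix.mean(dim=0)  # tensor([2., 3.])
row_means = matrix.mean(dim=1)     # tensor([1.5000, 3.5000])
```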

Let’s do some math that is more complex than averages and addition.

Here is the math behind z.backward()
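As an illustration of what z.backward() computes, here is a minimal sketch. The function z = x² + 3x is an example I chose for this sketch, so treat it as an assumption rather than the exact equation used above. Its derivative is dz/dx = 2x + 3, which is 7 at x = 2.

```python
import torch

# requires_grad=True tells PyTorch to track operations on x
x = torch.tensor(2.0, requires_grad=True)

# Example function (an assumption for this sketch): z = x^2 + 3x
z = x ** 2 + 3 * x  # z = 10 at x = 2

# Computes dz/dx and stores the result in x.grad
z.backward()

# By hand: dz/dx = 2x + 3 = 7 at x = 2
print(x.grad)  # tensor(7.)
```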

For more complex equations, we need to use the chain rule.

And here is the math behind this equation
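A sketch of the chain rule in action, using a composed function picked for illustration (an assumption, not necessarily the equation above): y = 3x + 1 and z = y². The chain rule gives dz/dx = dz/dy · dy/dx = 2y · 3, and PyTorch arrives at the same number.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

y = 3 * x + 1  # inner function, y = 7 at x = 2
z = y ** 2     # outer function, z = 49

z.backward()

# Chain rule by hand: dz/dx = dz/dy * dy/dx = (2 * y) * 3 = 2 * 7 * 3 = 42
print(x.grad)  # tensor(42.)
```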

In PyTorch, .backward() is mostly used to calculate the gradients of a model's loss, so it is more common to differentiate with respect to matrices than with respect to single numbers. Those derivatives are described by the Jacobian matrix, but I will not go into the details here. If you want to know more about the math, I recommend searching for it.

4. First Model

Now let’s build our first model.

First: We need to work with some data, so let’s create a simple dataset.
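A minimal sketch of such a dataset. The line y = 2x + 1 and the ten input values are assumptions made for this example; any simple linear relationship would do.

```python
import torch

# Inputs: the numbers 0..9 as a column, one sample per row -> shape (10, 1)
inputs = torch.arange(0, 10, dtype=torch.float32).reshape(-1, 1)

# Targets follow a hypothetical line, y = 2x + 1 -> shape (10, 1)
targets = 2 * inputs + 1

print(inputs.shape, targets.shape)
print(targets[:3].squeeze())  # tensor([1., 3., 5.])
```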

Second: Before creating the model from scratch, let’s understand what a deep learning model is.

A model is basically a mathematical function that applies some operations to an input and returns a value.

A deep learning model has parameters and layers. Deep learning gets its name from the fact that a model can have multiple layers.

Parameters are variables used in the mathematical operations that transform the input. Based on the computed loss, the model updates its parameters to return a better value. A model has two types of parameters: weights and biases. When a model is initialized, its weights and biases start as small random values, roughly between -1 and 1.

A linear layer is one of the basic building blocks of a deep learning model.

Image from: https://joshuagoings.com/2020/05/05/neural-network/

The equation of the linear layer is basically the input multiplied by the transposed weight matrix, plus a bias: y = x · Wᵀ + b.
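To check the equation, the sketch below computes a linear layer by hand and compares it with PyTorch's torch.nn.functional.linear, which implements the same formula. The numbers are arbitrary example values.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, 2.0, 3.0]])  # 1 sample with 3 input values
w = torch.tensor([[0.1, 0.2, 0.3],
                  [0.4, 0.5, 0.6]])  # 2 outputs x 3 inputs
b = torch.tensor([0.5, -0.5])        # one bias per output

# Input times the transposed weight matrix, plus the bias
manual = x @ w.t() + b

# PyTorch's built-in version of the same computation
builtin = F.linear(x, w, b)

print(manual)                           # approximately tensor([[1.9, 2.7]])
print(torch.allclose(manual, builtin))  # True
```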

Now let’s create our model.

The weight matrix has size target_values × input_values, and the bias vector has one element per target value.
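A from-scratch model with those shapes could look like the sketch below. The sizes (one input value, one target value), the use of torch.randn for initialization, and the names n_inputs, n_targets, and model are assumptions for this example.

```python
import torch

torch.manual_seed(0)  # only to make the random values reproducible

n_inputs, n_targets = 1, 1  # assumed sizes for a one-feature dataset

# Weight matrix: target_values x input_values; bias: one value per target
weights = torch.randn(n_targets, n_inputs, requires_grad=True)
bias = torch.randn(n_targets, requires_grad=True)

def model(x):
    """Linear layer: input times the transposed weights, plus the bias."""
    return x @ weights.t() + bias

# A batch of 5 samples with 1 input value each gives 5 predictions
sample_batch = torch.ones(5, n_inputs)
print(model(sample_batch).shape)  # torch.Size([5, 1])
```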

Third: Now that we have the data and the model, let’s define the optimizer and the loss function.

The loss function is used to calculate the difference between the model predictions and the actual target.

In our case, we will use the mean squared error (MSE) to calculate the difference.
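The mean squared error is the average of the squared differences between predictions and targets. A minimal hand-written version, with made-up example values, checked against PyTorch's built-in torch.nn.functional.mse_loss:

```python
import torch
import torch.nn.functional as F

def mse(predictions, targets):
    """Mean squared error: average of the squared differences."""
    diff = predictions - targets
    return torch.sum(diff * diff) / diff.numel()

predictions = torch.tensor([[2.0], [4.0], [6.0]])
targets = torch.tensor([[1.0], [4.0], [8.0]])

# Squared differences are 1, 0 and 4, so the loss is 5 / 3
loss = mse(predictions, targets)
print(loss)  # approximately tensor(1.6667)

# Same result as the built-in loss function
print(torch.allclose(loss, F.mse_loss(predictions, targets)))  # True
```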

But what will we do with the model loss? We will use it to calculate the gradient of the loss with respect to every parameter of the model, and then use those gradients to optimize the model.

With the gradients we can adjust the parameters to optimize the model. Picture the graph of the loss as a function of one parameter: the derivative of the loss with respect to that parameter tells us how steep the graph is at the current value and in which direction it rises. If the derivative is not 0, the parameter can still be adjusted to make the loss smaller. Let's take an example and analyze a loss graph.

If a gradient element is positive:

  • increasing the weight element’s value slightly will increase the loss
  • decreasing the weight element’s value slightly will decrease the loss

Image from: https://jovian.ai/aakashns/02-linear-regression

If a gradient element is negative:

  • increasing the weight element’s value slightly will decrease the loss
  • decreasing the weight element’s value slightly will increase the loss

Image from: https://jovian.ai/aakashns/02-linear-regression

So in our case, we will use this method: we subtract from each parameter a small quantity proportional to its gradient, which moves the parameter in the direction that reduces the loss (descending along the gradient).
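Here is a sketch of a single gradient descent step on a one-weight model. The values of x, y, the starting weight, and the learning rate are arbitrary choices for the example. The gradient is negative here, so subtracting it increases the weight and the loss goes down.

```python
import torch

w = torch.tensor(0.0, requires_grad=True)  # starting weight
x, y = torch.tensor(3.0), torch.tensor(6.0)  # one sample; the ideal weight is 2
learning_rate = 0.01

# Squared error for this sample: (0 * 3 - 6)^2 = 36
loss_before = (w * x - y) ** 2
loss_before.backward()
print(w.grad)  # tensor(-36.)  since d(loss)/dw = 2 * (w*x - y) * x

# Descend along the gradient; no_grad keeps this update out of the graph
with torch.no_grad():
    w -= learning_rate * w.grad  # 0 - 0.01 * (-36) = 0.36

loss_after = (w * x - y) ** 2
print(loss_after.item() < loss_before.item())  # True
```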

Fourth: So finally, let's train our model.

The training is divided into 5 steps:

  1. Make predictions
  2. Calculate the loss
  3. Calculate the gradients
  4. Adjust parameters
  5. Reset grads

Training is basically these 5 steps repeated many times.
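The five steps can be sketched as the loop below. The dataset (the line y = 2x + 1), the learning rate, and the number of epochs are assumptions chosen so that this example converges. They are not tuned values.

```python
import torch

torch.manual_seed(0)

# Hypothetical dataset on the line y = 2x + 1
inputs = torch.arange(0, 10, dtype=torch.float32).reshape(-1, 1)
targets = 2 * inputs + 1

# Randomly initialized parameters of a single linear layer
weights = torch.randn(1, 1, requires_grad=True)
bias = torch.randn(1, requires_grad=True)

def model(x):
    return x @ weights.t() + bias

def mse(predictions, targets):
    diff = predictions - targets
    return torch.sum(diff * diff) / diff.numel()

learning_rate = 0.01
epochs = 2000

for epoch in range(epochs):
    predictions = model(inputs)       # 1. make predictions
    loss = mse(predictions, targets)  # 2. calculate the loss
    loss.backward()                   # 3. calculate the gradients
    with torch.no_grad():
        weights -= learning_rate * weights.grad  # 4. adjust parameters
        bias -= learning_rate * bias.grad
        weights.grad.zero_()          # 5. reset grads
        bias.grad.zero_()

print(loss.item())                  # close to 0
print(weights.item(), bias.item())  # close to 2 and 1
```

Step 5 matters because PyTorch accumulates gradients: without resetting them, each call to backward() would add to the values left over from the previous iteration.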

Now when we check the loss, we can see that the model is almost perfect.

5. Conclusions

The outputs match the targets almost exactly. This is because our dataset is very simple, and it is just a basic linear prediction.

Let’s look at our data in a graph.

A sample of our dataset

This is the line that our deep learning model should discover. And in our case, our model discovered exactly this line.

With real-world data, it is practically impossible to create a perfect deep learning model with 100% accuracy, because the data will rarely be as cleanly linear as in our example.

In machine learning you need to be very careful when selecting hyperparameters such as the learning rate (how much of the gradient you subtract from the parameters at each step).

But now, let’s evaluate our model with unseen data to see if it is really working.
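One way to run such a check is sketched below. To keep the example self-contained it retrains a compact version of the model first. The dataset (y = 2x + 1), the hyperparameters, and the unseen input values are all assumptions for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical training data on the line y = 2x + 1
inputs = torch.arange(0, 10, dtype=torch.float32).reshape(-1, 1)
targets = 2 * inputs + 1

weights = torch.randn(1, 1, requires_grad=True)
bias = torch.randn(1, requires_grad=True)

# Compact version of the five training steps
for _ in range(2000):
    loss = ((inputs @ weights.t() + bias - targets) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        weights -= 0.01 * weights.grad
        bias -= 0.01 * bias.grad
        weights.grad.zero_()
        bias.grad.zero_()

# Inputs the model never saw during training
unseen_inputs = torch.tensor([[20.0], [35.5], [-4.0]])
expected = 2 * unseen_inputs + 1  # 41, 72, -7

# No gradients are needed when we only evaluate
with torch.no_grad():
    unseen_predictions = unseen_inputs @ weights.t() + bias

max_error = (unseen_predictions - expected).abs().max()
print(unseen_predictions.squeeze())  # close to 41, 72 and -7
print(max_error.item())              # close to 0
```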

Now that our model is working very well, you can feel free to create your own dataset and change the learning rate and the training epochs.

Here is the full code:

6. What should I learn now?

In this article I didn’t go deep into .backward(), so I recommend that you watch this video:

I didn’t make this video, but it is one of the best I have found on the internet.

I also recommend checking out the full playlist.

After watching this video, I recommend learning how to do the same thing we did here, but using PyTorch classes. Then learn how to build a model with multiple layers, convolutional layers, auto-encoders, GANs (Generative Adversarial Networks), and VAEs (Variational Auto-Encoders). By then you will have a good understanding of deep learning and will be able to decide which path to take next.

In the near future I will release another article about deep learning, teaching more about models with multiple layers.


Written by Gustavo Albuquerque Lima

Studies: React Native, Python, JavaScript, Node.js, C#, SQL, React.js, Bootstrap, PHP, PyTorch and NumPy