Let's go, Pytorch

Recently, I have been preparing for a ship detection competition and learning TensorFlow at the same time. But I noticed that the code style of TensorFlow is awful. I do not want to waste a lot of time learning the TensorFlow syntax, even though I am a big Google fan, so I tried to find some alternatives.

After reading a lot of introductions, I decided to learn PyTorch and use it as my main tool during my postgraduate period. PyTorch is awesome.

I have written a convolutional autoencoder (ConvAutoEncoder), and in this post I will show how to build it with PyTorch.

The first thing you need to do is import the modules:

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch
import torchvision
import torchvision.transforms as transforms
from torch.autograd import Variable

The second thing to do is define a class like this:

class ConvAutoEncoder(nn.Module):
    def __init__(self):
        super(ConvAutoEncoder, self).__init__()
        # encoder: conv + max-pooling (keep the pooling indices for unpooling later)
        self.conv_encoder = nn.Conv2d(3, 6, 5)
        self.pool_encoder = nn.MaxPool2d(2, 2, return_indices=True)
        # decoder: unpooling + transposed convolution
        self.pool_decoder = nn.MaxUnpool2d(2, 2)
        self.conv_decoder = nn.ConvTranspose2d(6, 3, 5)

We define every layer of the autoencoder in the initialization function, but you should call the initialization function of the superclass at the beginning. In this example, I use one conv layer and one pooling layer in the encoder part, and one unpooling layer and one deconv (transposed convolution) layer in the decoder part.
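
If you are not sure how the sizes work out, you can push a dummy CIFAR-10 sized input through the conv and deconv layers and print the shapes. This is just a quick sanity check (a sketch with a made-up random input); the layer parameters are the same as above.

import torch
import torch.nn as nn
from torch.autograd import Variable

x = Variable(torch.randn(1, 3, 32, 32))   # a fake CIFAR-10 image: 3 channels, 32x32
conv = nn.Conv2d(3, 6, 5)                 # 5x5 kernel, no padding: 32 - 5 + 1 = 28
deconv = nn.ConvTranspose2d(6, 3, 5)      # the transposed conv grows 28 back to 32

print(conv(x).size())                     # torch.Size([1, 6, 28, 28])
print(deconv(conv(x)).size())             # torch.Size([1, 3, 32, 32])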

Then, you should define the forward function like this:

def forward(self, x):
    output_conv = self.conv_encoder(F.relu(x))    # encode
    h, ind = self.pool_encoder(output_conv)       # pool and keep the indices
    output_unpool = self.pool_decoder(h, ind)     # unpool with the saved indices
    y = self.conv_decoder(F.relu(output_unpool))  # decode
    return y, h

The forward function is very clear: you can just write it following the real flow of the data. In this example, x is the input data and output_conv is the output of the convolution encoder. h is the hidden representation, which is the thing we actually want. output_unpool is the output of the unpooling layer. The unpooling layer needs the pooling indices, so you should pass 'return_indices=True' when creating the pooling layer and receive the indices in the variable ind.
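
To see why the indices matter, here is a tiny pooling/unpooling round trip on a dummy feature map (just a sketch; the sizes are made up). MaxUnpool2d puts each value back at the position recorded in ind and fills everything else with zeros, so without the indices it could not know where the maxima came from.

import torch
import torch.nn as nn
from torch.autograd import Variable

pool = nn.MaxPool2d(2, 2, return_indices=True)
unpool = nn.MaxUnpool2d(2, 2)

feat = Variable(torch.randn(1, 6, 28, 28))  # a dummy feature map
h, ind = pool(feat)                         # h: (1, 6, 14, 14), ind: positions of the maxima
rec = unpool(h, ind)                        # back to (1, 6, 28, 28), non-maxima are zeros

print(h.size(), rec.size())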

Then you should write a main function to instantiate an autoencoder and train it with a loss function.

PyTorch provides a useful package called torchvision, which can load many datasets such as CIFAR-10, MNIST, or COCO. We use CIFAR-10 as our training dataset. You can use this code to load it:

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

train_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=4,
                                           shuffle=True, num_workers=2)

train_loader is an iterable that yields batches of images and labels from the CIFAR-10 dataset.
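
If you want to check what the loader gives you, you can pull one batch out and look at the shapes (a small sketch, assuming the train_loader defined above):

images, labels = next(iter(train_loader))
print(images.size())   # torch.Size([4, 3, 32, 32]) with batch_size=4
print(labels.size())   # torch.Size([4]), the class labels (unused by the autoencoder)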

I use L1 loss (the mean absolute error between the reconstruction and the input) as my loss function:

loss_func = nn.L1Loss()
optimizer = optim.SGD(auto_encoder.parameters(), lr=0.01, momentum=0.9)

The optim module provides optimizers that automatically update the weights, biases, or any other parameters of your network.
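
Under the hood a plain SGD step is nothing magical; conceptually it just walks every parameter a small step against its gradient. Here is a rough sketch of what optimizer.step() does for vanilla SGD, ignoring momentum (this is only an illustration, not the real optim source code):

# conceptual sketch of a vanilla SGD update (no momentum)
lr = 0.01
for p in auto_encoder.parameters():
    if p.grad is not None:
        p.data -= lr * p.grad.data   # move each parameter against its gradient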

The last thing you should do is write a training loop to train your network:

for epoch in range(2):

    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data

        # wrap the tensors in Variables and move them to the GPU
        inputs, labels = Variable(inputs).cuda(), Variable(labels).cuda()

        # clear the gradients accumulated in the previous iteration
        optimizer.zero_grad()

        outputs, h = auto_encoder(inputs)
        loss = loss_func(outputs, inputs)   # reconstruction loss against the input itself
        loss.backward()
        optimizer.step()

        running_loss += loss.data[0]
        if i % 2000 == 1999:
            print('[Epoch: %d, Iter: %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

You call the zero_grad function to clear the gradients from the previous step, then just send the images to the autoencoder object and it will return the output. Then you can calculate the loss, call the backward function to compute the gradients, and call optimizer.step() to update the weights and biases.

If you want to train your network on the GPU, just call the cuda() function on the network object and on the input tensors.
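
If you are not sure whether a GPU is available on the machine, you can guard the cuda() calls (a small sketch):

use_gpu = torch.cuda.is_available()
auto_encoder = ConvAutoEncoder()
if use_gpu:
    auto_encoder = auto_encoder.cuda()   # move the model to the GPU only when one exists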

Now, run the code and you will get the loss output in the console.
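
The loss numbers alone do not tell you much for an autoencoder, so after training you may also want to save a few reconstructions and look at them. Here is a sketch using torchvision.utils.save_image (the file name is just an example); remember to undo the Normalize transform before saving:

import torchvision.utils as vutils

images, _ = next(iter(train_loader))
outputs, _ = auto_encoder(Variable(images).cuda())
outputs = outputs.data.cpu() * 0.5 + 0.5           # undo the normalization: x * std + mean
vutils.save_image(outputs, 'reconstruction.png')   # writes a grid of the reconstructed images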

If anything is wrong, please tell me by e-mail or leave a message on this page.

All code:

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch
import torchvision
import torchvision.transforms as transforms
from torch.autograd import Variable


class ConvAutoEncoder(nn.Module):
    def __init__(self):
        super(ConvAutoEncoder, self).__init__()
        self.conv_encoder = nn.Conv2d(3, 6, 5)
        self.pool_encoder = nn.MaxPool2d(2, 2, return_indices=True)
        self.pool_decoder = nn.MaxUnpool2d(2, 2)
        self.conv_decoder = nn.ConvTranspose2d(6, 3, 5)

    def forward(self, x):
        x = self.conv_encoder(F.relu(x))
        h, ind = self.pool_encoder(x)
        x = self.pool_decoder(h, ind)
        x = self.conv_decoder(F.relu(x))
        return x, h


if __name__ == "__main__":
    with torch.cuda.device(0):
        auto_encoder = ConvAutoEncoder().cuda()
        transform = transforms.Compose(
            [transforms.ToTensor(),
             transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

        train_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                                 download=True, transform=transform)
        train_loader = torch.utils.data.DataLoader(train_set, batch_size=4,
                                                   shuffle=True, num_workers=2)

        loss_func = nn.L1Loss()
        optimizer = optim.SGD(auto_encoder.parameters(), lr=0.01, momentum=0.9)
        for epoch in range(2):

            running_loss = 0.0
            for i, data in enumerate(train_loader, 0):
                inputs, labels = data

                inputs, labels = Variable(inputs).cuda(), Variable(labels).cuda()

                optimizer.zero_grad()

                outputs, h = auto_encoder(inputs)
                loss = loss_func(outputs, inputs)
                loss.backward()
                optimizer.step()

                running_loss += loss.data[0]
                if i % 2000 == 1999:
                    print('[Epoch: %d, Iter: %5d] loss: %.3f' %
                          (epoch + 1, i + 1, running_loss / 2000))
                    running_loss = 0.0

        print('Finished Training')