PytorchBasic
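The autograd package provides automatic differentiation for all operations on tensors. It is a define-by-run framework, which means that backpropagation is determined by how your code runs and can differ on every iteration.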
1. Concepts
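torch.Tensor is the central class of the package. If you set its .requires_grad attribute to True, autograd tracks every operation on that tensor. When the computation is finished, you can call .backward() to compute all gradients automatically. The gradient for this tensor is accumulated into its .grad attribute.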
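As a small illustration of the accumulation behaviour described above, the following sketch calls .backward() twice on the same input. The variable names are chosen here for illustration and do not come from the original text.

```python
import torch

# A scalar tensor that autograd should track.
x = torch.tensor(3.0, requires_grad=True)

# First pass: y = x^2, so dy/dx = 2 * x = 6.
y = x ** 2
y.backward()
first_grad = x.grad.item()

# Second pass on a fresh graph: the new gradient (6) is added to the old one.
y = x ** 2
y.backward()
accumulated_grad = x.grad.item()

# Reset the accumulated gradient before the next iteration.
x.grad.zero_()
reset_grad = x.grad.item()

print(first_grad, accumulated_grad, reset_grad)  # 6.0 12.0 0.0
```

This accumulation is why training loops usually clear gradients (for example with optimizer.zero_grad()) before each backward pass.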
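To stop autograd from tracking history (and using memory), wrap the code block in with torch.no_grad():. This is particularly useful when evaluating a model, because the model may have trainable parameters with requires_grad = True for which we do not need gradients during evaluation. You only need to define the forward function; the backward function, which computes the derivatives, is defined automatically when you use autograd. You can use any tensor operation inside the forward function.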
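The two points above can be sketched together as follows. TinyNet is a hypothetical module invented for this example; only forward is written by hand, and the same module is then called once with tracking and once inside torch.no_grad().

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # Any tensor operation may be used here; backward is derived by autograd.
        return torch.relu(self.fc(x))


model = TinyNet()
inputs = torch.randn(8, 4)

# Training-style call: the output is attached to the computational graph.
tracked = model(inputs)
tracked.sum().backward()
has_weight_grad = model.fc.weight.grad is not None

# Evaluation-style call: no history is recorded inside the block.
with torch.no_grad():
    untracked = model(inputs)

print(tracked.requires_grad, untracked.requires_grad, has_weight_grad)
# True False True
```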
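torch.Tensor - A multi-dimensional array that supports autograd operations such as backward() and also holds the gradient with respect to the tensor.
nn.Module - A neural network module. It is a convenient way to encapsulate parameters, with helpers for moving them to the GPU, exporting, loading, and so on.
nn.Parameter - A kind of tensor that is automatically registered as a parameter when it is assigned as an attribute to a Module.
autograd.Function - Implements the forward and backward definitions of an autograd operation. Every Tensor operation creates at least one Function node, which connects to the functions that created the Tensor and encodes its history.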
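To make the registration and history-tracking behaviour concrete, here is a minimal sketch. Scale is a hypothetical module made up for this example; it compares an nn.Parameter attribute with a plain tensor attribute and inspects the grad_fn recorded on a result.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        # Assigned as an attribute, so it is registered as a parameter.
        self.weight = nn.Parameter(torch.ones(3))
        # A plain tensor attribute is not registered.
        self.offset = torch.zeros(3)

    def forward(self, x):
        return x * self.weight + self.offset


module = Scale()
param_names = [name for name, _ in module.named_parameters()]

out = module(torch.randn(3))
# The result of an operation records the Function that produced it;
# a leaf tensor created by the user has no grad_fn.
out_has_grad_fn = out.grad_fn is not None
weight_is_leaf = module.weight.grad_fn is None

print(param_names)                      # ['weight']
print(out_has_grad_fn, weight_is_leaf)  # True True
```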
import torch
import torchvision
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms
# ================================================================== #
# Table of Contents #
# ================================================================== #
# 1. Basic autograd example 1
# 2. Basic autograd example 2
# 3. Loading data from numpy
# 4. Input pipeline
# 5. Input pipeline for custom dataset
# 6. Pretrained model
# 7. Save and load the model
# ================================================================== #
# 1. Basic autograd example 1 #
# ================================================================== #
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
# Build a computational graph.
y = w * x + b # y = 2 * x + 3
# Compute gradients.
y.backward()
# Print out the gradients.
print(x.grad) # x.grad = 2
print(w.grad) # w.grad = 1
print(b.grad) # b.grad = 1
# ================================================================== #
# 2. Basic autograd example 2 #
# ================================================================== #
# Create tensors of shape (10, 3) and (10, 2).
x = torch.randn(10, 3)
y = torch.randn(10, 2)
# Build a fully connected layer.
linear = nn.Linear(3, 2)
print ('w: ', linear.weight)
print ('b: ', linear.bias)
# Build loss function and optimizer.
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
# Forward pass.
pred = linear(x)
# Compute loss.
loss = criterion(pred, y)
print('loss: ', loss.item())
# Backward pass.
loss.backward()
# Print out the gradients.
print ('dL/dw: ', linear.weight.grad)
print ('dL/db: ', linear.bias.grad)
# 1-step gradient descent.
optimizer.step()
# You can also perform gradient descent at the low level.
# linear.weight.data.sub_(0.01 * linear.weight.grad.data)
# linear.bias.data.sub_(0.01 * linear.bias.grad.data)
# Print out the loss after 1-step gradient descent.
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())
# ================================================================== #
# 3. Loading data from numpy #
# ================================================================== #
# Create a numpy array.
x = np.array([[1, 2], [3, 4]])
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
# Convert the torch tensor to a numpy array.
z = y.numpy()
# ================================================================== #
# 4. Input pipeline #
# ================================================================== #
# Download and construct CIFAR-10 dataset.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True,
                                             transform=transforms.ToTensor(),
                                             download=True)
# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print (image.size())
print (label)
# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64,
                                           shuffle=True)
# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)
# Mini-batch images and labels.
images, labels = next(data_iter)
# Actual usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
# ================================================================== #
# 5. Input pipeline for custom dataset #
# ================================================================== #
# You should build your custom dataset as below.
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self):
        # TODO
        # 1. Initialize file paths or a list of file names.
        pass

    def __getitem__(self, index):
        # TODO
        # 1. Read one sample from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        pass

    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0
# You can then use the prebuilt data loader.
custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset,
                                           batch_size=64,
                                           shuffle=True)
# ================================================================== #
# 6. Pretrained model #
# ================================================================== #
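# Download and load the pretrained ResNet-18.
# Note: torchvision 0.13 and later deprecate `pretrained=True` in favor of the
# `weights` argument (e.g. weights=torchvision.models.ResNet18_Weights.DEFAULT).
resnet = torchvision.models.resnet18(pretrained=True)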
# If you want to finetune only the top layer of the model, set as below.
for param in resnet.parameters():
    param.requires_grad = False
# Replace the top layer for finetuning.
resnet.fc = nn.Linear(resnet.fc.in_features, 100) # 100 is an example.
# Forward pass.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print (outputs.size()) # (64, 100)
# ================================================================== #
# 7. Save and load the model #
# ================================================================== #
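# Save and load the entire model.
# Note: recent PyTorch releases default torch.load to weights_only=True, so
# loading a fully pickled model may require weights_only=False (trusted files only).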
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')
# Save and load only the model parameters (recommended).
torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))
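The recommended parameters-only approach can be checked with a small round trip. This is a minimal sketch under stated assumptions: it uses two small nn.Linear layers instead of ResNet-18, and an in-memory io.BytesIO buffer stands in for the 'params.ckpt' file so nothing is written to disk.

```python
import io

import torch
import torch.nn as nn

source = nn.Linear(3, 2)
target = nn.Linear(3, 2)  # same architecture, different random initial weights

# Save only the parameters; a BytesIO buffer stands in for a file on disk.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# load_state_dict copies the saved values into an existing module.
target.load_state_dict(torch.load(buffer))

same_weight = torch.equal(source.weight, target.weight)
same_bias = torch.equal(source.bias, target.bias)
print(same_weight, same_bias)  # True True
```

Because only tensors are stored, the loading side has to construct the module itself before calling load_state_dict, which keeps the checkpoint independent of the class definition's pickled form.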