Machine Learning | A Concise PyTorch Tutorial, Part 1
The previous articles covered feature normalization and tensors. The next two articles form a concise PyTorch tutorial focused on simple, hands-on PyTorch practice.
1. Arithmetic operations
import torch
a = torch.tensor([2, 3, 4])
b = torch.tensor([3, 4, 5])
print("a + b: ", (a + b).numpy())
print("a - b: ", (a - b).numpy())
print("a * b: ", (a * b).numpy())
print("a / b: ", (a / b).numpy())
Element-wise addition, subtraction, multiplication, and division need little explanation; note that dividing two integer tensors performs true division and returns a float tensor. The output is:
a + b: [5 7 9]
a - b: [-1 -1 -1]
a * b: [ 6 12 20]
a / b: [0.6666667 0.75 0.8 ]
2. Linear regression
Linear regression finds a straight line that comes as close as possible to a set of known points, as shown:
Figure 1
import torch
from torch import optim

def build_model1():
    return torch.nn.Sequential(
        torch.nn.Linear(1, 1, bias=False)
    )

def build_model2():
    model = torch.nn.Sequential()
    model.add_module("linear", torch.nn.Linear(1, 1, bias=False))
    return model

def train(model, loss, optimizer, x, y):
    model.train()
    optimizer.zero_grad()
    # Forward pass: reshape x to (N, 1), predict, and drop the trailing dimension
    fx = model.forward(x.view(len(x), 1)).squeeze()
    output = loss.forward(fx, y)
    # Backward pass and parameter update
    output.backward()
    optimizer.step()
    return output.item()

def main():
    torch.manual_seed(42)
    X = torch.linspace(-1, 1, 101, requires_grad=False)
    Y = 2 * X + torch.randn(X.size()) * 0.33
    print("X: ", X.numpy(), ", Y: ", Y.numpy())
    model = build_model1()
    loss = torch.nn.MSELoss(reduction='mean')
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    batch_size = 10
    for i in range(100):
        cost = 0.
        num_batches = len(X) // batch_size
        for k in range(num_batches):
            start, end = k * batch_size, (k + 1) * batch_size
            cost += train(model, loss, optimizer, X[start:end], Y[start:end])
        print("Epoch = %d, cost = %s" % (i + 1, cost / num_batches))
    w = next(model.parameters()).data
    print("w = %.2f" % w.numpy())

if __name__ == "__main__":
    main()
(1) Starting from the main function: torch.manual_seed(42) seeds the random number generator so that each run produces the same sequence of random numbers. It takes an integer seed and is used in scenarios that rely on randomness, such as training neural networks, to make results reproducible;
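As a minimal illustration (not part of the original code), resetting the same seed reproduces the same random draw:

import torch

torch.manual_seed(42)
first = torch.randn(3)
torch.manual_seed(42)              # same seed again
second = torch.randn(3)
print(torch.equal(first, second))  # True: the two draws are identical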
(2) torch.linspace(-1, 1, 101, requires_grad=False) generates evenly spaced values over a given interval. It takes three arguments, the start value, the end value, and the number of elements, and returns a tensor containing that many evenly spaced values;
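For example, a quick check with a smaller element count (our own snippet, not from the article):

import torch

print(torch.linspace(-1, 1, 5))
# tensor([-1.0000, -0.5000,  0.0000,  0.5000,  1.0000])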
(3) Inside build_model1:
- torch.nn.Sequential(torch.nn.Linear(1, 1, bias=False)) passes a linear layer to the nn.Sequential constructor and returns a neural network model containing that single linear layer;
- build_model2 is functionally identical to build_model1; it uses the add_module() method to add the same linear layer as a submodule named linear (see the quick comparison below);
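The two builders produce the same architecture; printing both models (an illustrative check, not in the original code) shows that the only difference is the submodule name:

print(build_model1())
# Sequential(
#   (0): Linear(in_features=1, out_features=1, bias=False)
# )
print(build_model2())
# Sequential(
#   (linear): Linear(in_features=1, out_features=1, bias=False)
# )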
(4) torch.nn.MSELoss(reduction='mean') defines the loss function (mean squared error);
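With reduction='mean', MSELoss averages the squared errors over all elements; a minimal check with made-up values:

import torch

loss = torch.nn.MSELoss(reduction='mean')
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])
print(loss(pred, target))  # tensor(1.3333) = (0**2 + 0**2 + 2**2) / 3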
(5) optim.SGD(model.parameters(), lr=0.01, momentum=0.9) implements the stochastic gradient descent (SGD) optimization algorithm with momentum;
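Roughly, SGD with momentum keeps a velocity buffer v for each parameter and updates v = momentum * v + grad, then p = p - lr * v. The sketch below hand-rolls a single such step on a toy parameter (simplified: no weight decay or dampening; not part of the original code):

import torch

w = torch.tensor([1.0], requires_grad=True)
loss = (2 * w - 3).pow(2).sum()    # toy loss; d(loss)/dw = 4 * (2w - 3) = -4 at w = 1
loss.backward()

lr, momentum = 0.01, 0.9
velocity = w.grad.clone()          # on the very first step the buffer is just the gradient
with torch.no_grad():
    w -= lr * velocity             # w moves from 1.00 to 1.04
print(w)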
(6) The training set is split into mini-batches of batch_size, and the training loop runs for 100 epochs;
(7) Next comes the training function train, which performs one training step for a neural network model. It takes the following parameters:
- model: the neural network model, usually an instance of a class derived from nn.Module;
- loss: the loss function, which measures the difference between the model's predictions and the true values;
- optimizer: the optimizer, which updates the model's parameters;
- x: the input data, a torch.Tensor;
- y: the target data, a torch.Tensor;
(8) train follows the generic PyTorch training procedure:
- switch the model to training mode, enabling training-time behaviors such as dropout and batch normalization;
- zero the gradients cached by the optimizer so a new round of gradients can be computed;
- pass the input through the model to get predictions, then pass the predictions and targets to the loss function to compute the loss;
- backpropagate the loss to compute gradients for the model parameters;
- let the optimizer update the parameters to minimize the loss;
- return the loss as a scalar value;
(9) print("Epoch = %d, cost = %s" % (i + 1, cost / num_batches)) prints the current epoch and the average loss. Since Y was generated as 2 * X plus noise, the learned weight should converge to roughly 2, and the run below indeed ends with w = 1.98:
...
Epoch = 95, cost = 0.10514946877956391
Epoch = 96, cost = 0.10514946877956391
Epoch = 97, cost = 0.10514946877956391
Epoch = 98, cost = 0.10514946877956391
Epoch = 99, cost = 0.10514946877956391
Epoch = 100, cost = 0.10514946877956391
w = 1.98
3. Logistic regression
Logistic regression fits a curve that separates a set of discrete points into classes; the example below uses its multi-class form to classify MNIST digits, as shown:
Figure 2
import numpy as np
import torch
from torch import optim
from data_util import load_mnist

def build_model(input_dim, output_dim):
    return torch.nn.Sequential(
        torch.nn.Linear(input_dim, output_dim, bias=False)
    )

def train(model, loss, optimizer, x_val, y_val):
    model.train()
    optimizer.zero_grad()
    fx = model.forward(x_val)
    output = loss.forward(fx, y_val)
    output.backward()
    optimizer.step()
    return output.item()

def predict(model, x_val):
    model.eval()
    output = model.forward(x_val)
    return output.data.numpy().argmax(axis=1)

def main():
    torch.manual_seed(42)
    # labels are returned as class indices (not one-hot vectors)
    trX, teX, trY, teY = load_mnist(onehot=False)
    trX = torch.from_numpy(trX).float()
    teX = torch.from_numpy(teX).float()
    trY = torch.tensor(trY)
    n_examples, n_features = trX.size()
    n_classes = 10
    model = build_model(n_features, n_classes)
    loss = torch.nn.CrossEntropyLoss(reduction='mean')
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    batch_size = 100
    for i in range(100):
        cost = 0.
        num_batches = n_examples // batch_size
        for k in range(num_batches):
            start, end = k * batch_size, (k + 1) * batch_size
            cost += train(model, loss, optimizer,
                          trX[start:end], trY[start:end])
        predY = predict(model, teX)
        print("Epoch %d, cost = %f, acc = %.2f%%"
              % (i + 1, cost / num_batches, 100. * np.mean(predY == teY)))

if __name__ == "__main__":
    main()
(1) Starting from the main function again: torch.manual_seed(42) was covered above, so we skip it here;
(2) load_mnist is our own helper for downloading the MNIST dataset; it returns trX and teX as the input data and trY and teY as the label data;
(3) Inside build_model: torch.nn.Sequential(torch.nn.Linear(input_dim, output_dim, bias=False)) builds a neural network model containing a single linear layer with input_dim input features, output_dim output features, and no bias term, where n_classes = 10 means the model outputs scores for 10 classes;
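Note that the model outputs raw, unnormalized scores (logits) and has no softmax layer: torch.nn.CrossEntropyLoss applies log-softmax internally and expects integer class indices as targets, which is also why the labels are kept as class indices rather than one-hot vectors. A minimal illustration with made-up values:

import torch

loss = torch.nn.CrossEntropyLoss(reduction='mean')
logits = torch.tensor([[2.0, 0.5, 0.1],    # scores for 3 classes, sample 1
                       [0.2, 0.3, 3.0]])   # scores for 3 classes, sample 2
targets = torch.tensor([0, 2])             # true class indices, not one-hot
print(loss(logits, targets))               # ≈ tensor(0.2186)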
(4) The remaining steps are the same as before: define the loss function and the gradient descent optimizer, split the training set into mini-batches of batch_size, and loop 100 times over train;
(5) optim.SGD(model.parameters(), lr=0.01, momentum=0.9) again implements the stochastic gradient descent (SGD) optimization algorithm;
(6) After each training epoch, predict is executed. It takes two parameters, model (the trained model) and teX (the data to predict on), and works as follows:
- model.eval() switches the model to evaluation mode, meaning it will not be trained but only used for inference;
- the output is converted to a NumPy array, and argmax() is used to get the predicted class for each sample (see the small example after this list);
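A tiny example of the argmax step with made-up scores:

import numpy as np

scores = np.array([[0.1, 2.3, 0.4],
                   [1.7, 0.2, 0.9]])
print(scores.argmax(axis=1))  # [1 0]: the index of the highest score in each row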
(7) print("Epoch %d, cost = %f, acc = %.2f%%" % (i + 1, cost / num_batches, 100. * np.mean(predY == teY))) prints the epoch number, the loss, and the accuracy. The output of the code above is as follows (it runs quickly, but the accuracy is relatively low):
...
Epoch 91, cost = 0.252863, acc = 92.52%
Epoch 92, cost = 0.252717, acc = 92.51%
Epoch 93, cost = 0.252573, acc = 92.50%
Epoch 94, cost = 0.252431, acc = 92.50%
Epoch 95, cost = 0.252291, acc = 92.52%
Epoch 96, cost = 0.252153, acc = 92.52%
Epoch 97, cost = 0.252016, acc = 92.51%
Epoch 98, cost = 0.251882, acc = 92.51%
Epoch 99, cost = 0.251749, acc = 92.51%
Epoch 100, cost = 0.251617, acc = 92.51%
4. Neural network
A classic LeNet network, used to classify characters, is shown in the figure:
Figure 3
- define a multi-layer neural network
- preprocess the dataset and prepare it as the network's input
- feed the data through the network
- compute the network's loss
- backpropagate to compute the gradients
import numpy as np
import torch
from torch import optim
from data_util import load_mnist

def build_model(input_dim, output_dim):
    return torch.nn.Sequential(
        torch.nn.Linear(input_dim, 512, bias=False),
        torch.nn.Sigmoid(),
        torch.nn.Linear(512, output_dim, bias=False)
    )

def train(model, loss, optimizer, x_val, y_val):
    model.train()
    optimizer.zero_grad()
    fx = model.forward(x_val)
    output = loss.forward(fx, y_val)
    output.backward()
    optimizer.step()
    return output.item()

def predict(model, x_val):
    model.eval()
    output = model.forward(x_val)
    return output.data.numpy().argmax(axis=1)

def main():
    torch.manual_seed(42)
    # labels are returned as class indices (not one-hot vectors)
    trX, teX, trY, teY = load_mnist(onehot=False)
    trX = torch.from_numpy(trX).float()
    teX = torch.from_numpy(teX).float()
    trY = torch.tensor(trY)
    n_examples, n_features = trX.size()
    n_classes = 10
    model = build_model(n_features, n_classes)
    loss = torch.nn.CrossEntropyLoss(reduction='mean')
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    batch_size = 100
    for i in range(100):
        cost = 0.
        num_batches = n_examples // batch_size
        for k in range(num_batches):
            start, end = k * batch_size, (k + 1) * batch_size
            cost += train(model, loss, optimizer,
                          trX[start:end], trY[start:end])
        predY = predict(model, teX)
        print("Epoch %d, cost = %f, acc = %.2f%%"
              % (i + 1, cost / num_batches, 100. * np.mean(predY == teY)))

if __name__ == "__main__":
    main()
(1) This neural network code differs little from the logistic regression example; the difference lies in build_model, which here builds a model with two linear layers and a Sigmoid activation in between: a linear layer with input_dim input features and 512 output features, a Sigmoid activation, and a second linear layer with 512 input features and output_dim output features;
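For reference, the same two-layer architecture can also be written as an nn.Module subclass, which is the more common style for larger networks. The sketch below is ours (the class name TwoLayerNet is not from the original code) and is equivalent to build_model:

import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim=512):
        super().__init__()
        self.fc1 = torch.nn.Linear(input_dim, hidden_dim, bias=False)
        self.act = torch.nn.Sigmoid()
        self.fc2 = torch.nn.Linear(hidden_dim, output_dim, bias=False)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

model = TwoLayerNet(784, 10)  # MNIST images have 28 * 28 = 784 features, 10 classes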
(2) print("Epoch %d, cost = %f, acc = %.2f%%" % (i + 1, cost / num_batches, 100. * np.mean(predY == teY))) prints the epoch number, the loss, and the accuracy. The output of the code above is as follows (it takes longer to run than logistic regression, but the accuracy is much higher):
...
Epoch 91, cost = 0.054484, acc = 97.58%
Epoch 92, cost = 0.053753, acc = 97.56%
Epoch 93, cost = 0.053036, acc = 97.60%
Epoch 94, cost = 0.052332, acc = 97.61%
Epoch 95, cost = 0.051641, acc = 97.63%
Epoch 96, cost = 0.050964, acc = 97.66%
Epoch 97, cost = 0.050298, acc = 97.66%
Epoch 98, cost = 0.049645, acc = 97.67%
Epoch 99, cost = 0.049003, acc = 97.67%
Epoch 100, cost = 0.048373, acc = 97.68%