Deep Learning: Autoencoder Basics and Types

Author: compiled by 机器之心 (Jiqizhixin), 2017-09-24 12:13:52


Deep learning is clearly set to have a significant impact on our society. Pramod Chandrayan, founder and CEO of Mobibit, recently published an article on codeburst.io that introduces the basics and types of autoencoders and gives code examples.

Continuing from my previous article, "Deep Learning: What & Why?" (https://goo.gl/Ka3YoF), today we take a closer look at the types of deep learning architectures and discuss autoencoders in detail.

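When the human brain teams up with deep learning machines:

Before we start demystifying deep networks, let us first define deep learning. As I understand it:

Deep learning is an advanced machine learning technique in which multiple layers of abstraction communicate with each other. Each layer is deeply connected to the previous one and makes decisions based on the output fed to it by that layer.

Investopedia defines deep learning as follows:

Deep learning is a subset of machine learning within artificial intelligence (AI) that has networks capable of learning, unsupervised, from data that is unstructured or unlabeled. It is also known as deep neural learning or deep neural networks.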

Today we take a deep dive into how Unsupervised Pretrained Networks work.

UPN: Unsupervised Pretrained Networks

These unsupervised learning networks can be further classified into:

Autoencoders
Deep belief networks (DBN)
Generative adversarial networks (GAN)

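An autoencoder is a neural network with three layers: an input layer, a hidden (encoding) layer, and a decoding layer. The network is trained to reconstruct its input, which forces the hidden layer to learn a good representation of that input.

An autoencoder neural network is an unsupervised machine learning algorithm that applies backpropagation with the target values set equal to the inputs. In other words, an autoencoder is trained to copy its input to its output. Internally, it has a hidden layer that describes a code used to represent the input.

The goal of an autoencoder is to learn a function h(x) ≈ x. Put differently, it learns an approximation of the identity function, so that the output x̂ is approximately equal to the input x. Autoencoders belong to the neural network family, but they are also closely related to PCA (principal component analysis).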
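To make the idea that the target is the input itself concrete, here is a minimal NumPy sketch of a single forward pass through an untrained three-layer autoencoder. The layer sizes, the weight initialization, and the names `W_enc`, `W_dec`, and `autoencode` are illustrative assumptions, not part of the original article.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.RandomState(0)
n_in, n_hidden = 8, 3  # hypothetical sizes: the code is smaller than the input

# Encoder and decoder parameters (untrained, small random weights)
W_enc = rng.normal(scale=0.1, size=(n_in, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_in))
b_dec = np.zeros(n_in)

def autoencode(x):
    code = sigmoid(x @ W_enc + b_enc)      # hidden layer: the code
    x_hat = sigmoid(code @ W_dec + b_dec)  # decoding layer: the reconstruction
    return code, x_hat

x = rng.rand(5, n_in)               # a toy batch of 5 inputs in [0, 1]
code, x_hat = autoencode(x)
loss = np.mean((x_hat - x) ** 2)    # the training target is x itself
print(code.shape, x_hat.shape, loss)
```

Training would adjust the weights by backpropagation to reduce `loss`; no labels are involved at any point, which is why the method counts as unsupervised.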

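Some key facts about autoencoders:

It is an unsupervised machine learning algorithm similar to PCA
It minimizes the same kind of objective as PCA, namely the reconstruction error (the two coincide exactly only for a linear autoencoder trained with squared error)
It is a neural network
The target output of this neural network is its own input

Although autoencoders resemble PCA, they are far more flexible. During encoding, an autoencoder can represent both linear and nonlinear transformations, whereas PCA can only perform linear ones. Because an autoencoder is itself a network, it can also be used as a layer when building deep learning networks.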
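The comparison with PCA can be sketched by writing PCA itself as a purely linear encoder and decoder. The snippet below is a minimal illustration using NumPy's SVD; the toy data and the helper name `pca_reconstruct` are assumptions made for this example.

```python
import numpy as np

def pca_reconstruct(X, k):
    """Project X onto its top-k principal components and map it back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                 # shape (k, n_features)
    code = Xc @ components.T            # linear "encoder"
    X_hat = code @ components + mean    # linear "decoder"
    return code, X_hat

rng = np.random.RandomState(0)
X = rng.randn(200, 5) @ rng.randn(5, 5)   # correlated toy data

for k in (1, 3, 5):
    _, X_hat = pca_reconstruct(X, k)
    print(k, np.mean((X - X_hat) ** 2))   # reconstruction error shrinks as k grows
```

An autoencoder replaces the two matrix products with learned layers that may include nonlinear activations, which is where the extra flexibility comes from.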

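Types of autoencoders:

Denoising autoencoder
Sparse autoencoder
Variational autoencoder (VAE)
Contractive autoencoder (CAE)

A. Denoising autoencoder

This is one of the most basic types of autoencoder. It addresses the risk of simply learning the identity function by randomly corrupting part of the input, so that the autoencoder must recover, or denoise, the original.

The technique can be used to obtain a good representation of the input. A good representation is one that can be obtained robustly from a corrupted input and that is useful for recovering the corresponding clean input.

The idea behind denoising autoencoders is simple. To force the hidden layer to discover more robust features, and to prevent it from merely learning the identity mapping, we train the autoencoder to reconstruct the input from a corrupted version of it.

The amount of noise applied to the input is expressed as a percentage. A value of 30% (0.3) usually works well, but if you have very little data you may want to add more noise.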
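As a minimal sketch of what a corruption level of 0.3 means, the snippet below applies masking noise: each input entry is set to zero independently with probability 0.3. Masking noise is an assumption here (other noise types, such as Gaussian noise, are also used), and the helper name `corrupt` is hypothetical.

```python
import numpy as np

def corrupt(x, corruption_level, rng):
    """Zero out each entry of x independently with probability corruption_level."""
    keep_mask = rng.binomial(n=1, p=1.0 - corruption_level, size=x.shape)
    return x * keep_mask

rng = np.random.RandomState(123)
x = np.ones((1000, 20))            # toy input: all ones, so zeros are easy to count
x_tilde = corrupt(x, 0.3, rng)

print("fraction zeroed:", 1.0 - x_tilde.mean())   # close to 0.3
```

The autoencoder then receives `x_tilde` as input but is scored against the clean `x`.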

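Stacked denoising autoencoder (SDA):

This is a denoising autoencoder that applies unsupervised pretraining layer by layer. Each layer is pretrained to perform feature selection and extraction on the output of the layer before it, and a supervised fine-tuning stage follows. An SDA simply combines many denoising autoencoders. Once the first k layers are trained, we can train layer k+1, because we can now compute the code, or latent representation, from the layer below.

Once all layers are pretrained, the network goes through a stage called fine-tuning. Here we use supervised learning to minimize the prediction error on a supervised task, and we train the whole network as we would train a multilayer perceptron. At this stage we consider only the encoding part of each autoencoder. The stage is supervised because from this point on we use the target classes during training.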
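The greedy layer-wise pretraining described above can be sketched in plain NumPy. This is a simplified illustration, not the Theano implementation discussed later: the class `TinyDA`, the tied decoder weights, the toy data, the layer sizes, and the learning rate are all assumptions, and the supervised fine-tuning stage is left out.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class TinyDA:
    """Denoising autoencoder with tied weights (the decoder uses W.T)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.uniform(-0.1, 0.1, size=(n_visible, n_hidden))
        self.b_hid = np.zeros(n_hidden)
        self.b_vis = np.zeros(n_visible)
        self.rng = rng

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_hid)

    def train_step(self, x, corruption_level, lr):
        # Corrupt the input with masking noise, but reconstruct the clean x.
        mask = self.rng.binomial(1, 1.0 - corruption_level, size=x.shape)
        x_tilde = x * mask
        h = self.encode(x_tilde)
        z = sigmoid(h @ self.W.T + self.b_vis)
        # Cross-entropy reconstruction cost, averaged over the batch.
        eps = 1e-9
        cost = -np.mean(np.sum(x * np.log(z + eps)
                               + (1 - x) * np.log(1 - z + eps), axis=1))
        # Backpropagation through the decoder and the encoder.
        d_out = (z - x) / x.shape[0]             # grad w.r.t. decoder pre-activation
        d_hid = (d_out @ self.W) * h * (1 - h)   # grad w.r.t. encoder pre-activation
        grad_W = x_tilde.T @ d_hid + d_out.T @ h # W is used in both halves
        self.W -= lr * grad_W
        self.b_hid -= lr * d_hid.sum(axis=0)
        self.b_vis -= lr * d_out.sum(axis=0)
        return cost

# Toy binary data: noisy copies of four random prototype patterns.
rng = np.random.RandomState(0)
prototypes = (rng.rand(4, 16) > 0.5).astype(float)
X = prototypes[rng.randint(0, 4, size=256)]
flip = rng.rand(*X.shape) < 0.05
X = np.where(flip, 1.0 - X, X)

layer_sizes = [12, 8]
corruption_levels = [0.1, 0.2]
layers, costs = [], []
layer_input = X
for n_hidden, level in zip(layer_sizes, corruption_levels):
    da = TinyDA(layer_input.shape[1], n_hidden, rng)
    history = [da.train_step(layer_input, level, lr=0.2) for _ in range(300)]
    layers.append(da)
    costs.append(history)
    # The clean code of this layer becomes the input of the next layer.
    layer_input = da.encode(layer_input)

for i, history in enumerate(costs):
    print("layer %d: first cost %.3f, last cost %.3f" % (i, history[0], history[-1]))
```

After this loop, only the `encode` half of each layer would be kept, a classifier would be placed on top of `layer_input`, and the whole stack would be trained with labels.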

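Explaining SDA with a code example

This section is taken from deeplearning.net (an excellent reference for anyone who wants to understand deep learning), which explains stacked denoising autoencoders well using a worked example.

We can view a stacked denoising autoencoder in two ways: as a list of autoencoders, and as a multilayer perceptron (MLP). During pretraining we use the first view: we treat the model as a list of autoencoders and train each one separately. In the second stage of training we use the second view. The two views are linked because:

the autoencoders and the sigmoid layers of the MLP share parameters;

the latent representations computed by the intermediate layers of the MLP are fed as input to the autoencoders.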

import numpy
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

# Helper classes from the other deeplearning.net tutorial files
# (module names assumed to match the tutorial code)
from logistic_sgd import LogisticRegression
from mlp import HiddenLayer
from dA import dA


class SdA(object):
    """Stacked denoising auto-encoder class (SdA)

    A stacked denoising autoencoder model is obtained by stacking several
    dAs. The hidden layer of the dA at layer `i` becomes the input of
    the dA at layer `i+1`. The first layer dA gets as input the input of
    the SdA, and the hidden layer of the last dA represents the output.
    Note that after pretraining, the SdA is dealt with as a normal MLP,
    the dAs are only used to initialize the weights.
    """

    def __init__(
        self,
        numpy_rng,
        theano_rng=None,
        n_ins=784,
        hidden_layers_sizes=[500, 500],
        n_outs=10,
        corruption_levels=[0.1, 0.1]
    ):
        """ This class is made to support a variable number of layers.

        :type numpy_rng: numpy.random.RandomState
        :param numpy_rng: numpy random number generator used to draw initial
                          weights

        :type theano_rng: theano.tensor.shared_randomstreams.RandomStreams
        :param theano_rng: Theano random generator; if None is given one is
                           generated based on a seed drawn from `numpy_rng`

        :type n_ins: int
        :param n_ins: dimension of the input to the sdA

        :type hidden_layers_sizes: list of ints
        :param hidden_layers_sizes: intermediate layers size, must contain
                                    at least one value

        :type n_outs: int
        :param n_outs: dimension of the output of the network

        :type corruption_levels: list of float
        :param corruption_levels: amount of corruption to use for each
                                  layer
        """
        self.sigmoid_layers = []
        self.dA_layers = []
        self.params = []
        self.n_layers = len(hidden_layers_sizes)

        assert self.n_layers > 0

        if not theano_rng:
            theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))
        # allocate symbolic variables for the data
        self.x = T.matrix('x')   # the data is presented as rasterized images
        self.y = T.ivector('y')  # the labels are presented as 1D vector of
                                 # [int] labels

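self.sigmoid_layers will store the sigmoid layers of the MLP view, while self.dA_layers will store the denoising autoencoders associated with the layers of that MLP. Next, we construct n_layers sigmoid layers and n_layers denoising autoencoders, where n_layers is the depth of our model. We use the HiddenLayer class introduced in the multilayer perceptron tutorial, with one modification: we replace the tanh nonlinearity with the logistic function s(x) = 1 / (1 + e^(-x)).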
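For readers who want to see how the logistic function differs from tanh, the short NumPy check below compares their output ranges and verifies the identity tanh(x) = 2 * s(2x) - 1. The helper name `logistic` is chosen for this example only.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-4.0, 4.0, 9)
print(logistic(x))   # values lie in (0, 1)
print(np.tanh(x))    # values lie in (-1, 1)

# tanh is a rescaled and shifted logistic function.
print(np.allclose(np.tanh(x), 2.0 * logistic(2.0 * x) - 1.0))
```

One plausible reason for the swap is that outputs in (0, 1) can be read as pixel intensities or probabilities, which suits a reconstruction cost defined on inputs in that range.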

We link the sigmoid layers together to form an MLP, and we construct the autoencoders so that the encoding part of each one shares its weight matrix and bias with the corresponding sigmoid layer.

        for i in range(self.n_layers):
            # construct the sigmoidal layer

            # the size of the input is either the number of hidden units of
            # the layer below or the input size if we are on the first layer
            if i == 0:
                input_size = n_ins
            else:
                input_size = hidden_layers_sizes[i - 1]

            # the input to this layer is either the activation of the hidden
            # layer below or the input of the SdA if you are on the first
            # layer
            if i == 0:
                layer_input = self.x
            else:
                layer_input = self.sigmoid_layers[-1].output

            sigmoid_layer = HiddenLayer(rng=numpy_rng,
                                        input=layer_input,
                                        n_in=input_size,
                                        n_out=hidden_layers_sizes[i],
                                        activation=T.nnet.sigmoid)
            # add the layer to our list of layers
            self.sigmoid_layers.append(sigmoid_layer)
            # it's arguably a philosophical question...
            # but we are going to only declare that the parameters of the
            # sigmoid_layers are parameters of the StackedDAA
            # the visible biases in the dA are parameters of those
            # dA, but not the SdA
            self.params.extend(sigmoid_layer.params)

            # Construct a denoising autoencoder that shares weights with this
            # layer
            dA_layer = dA(numpy_rng=numpy_rng,
                          theano_rng=theano_rng,
                          input=layer_input,
                          n_visible=input_size,
                          n_hidden=hidden_layers_sizes[i],
                          W=sigmoid_layer.W,
                          bhid=sigmoid_layer.b)
            self.dA_layers.append(dA_layer)
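The effect of passing `W=sigmoid_layer.W` and `bhid=sigmoid_layer.b` can be illustrated with a plain NumPy analogy: both objects hold references to the same arrays, so an in-place update made through the autoencoder is visible to the MLP layer. In Theano the sharing happens through shared variables rather than NumPy arrays; the two classes below are hypothetical stand-ins, not the tutorial's `HiddenLayer` and `dA`.

```python
import numpy as np

class SigmoidLayer:
    """Hypothetical stand-in for HiddenLayer: it only holds W and b."""
    def __init__(self, n_in, n_out):
        self.W = np.zeros((n_in, n_out))
        self.b = np.zeros(n_out)

class DenoisingAE:
    """Hypothetical stand-in for dA: it reuses the W and b it is given."""
    def __init__(self, W, bhid, n_visible):
        self.W = W                          # shared with the sigmoid layer
        self.b = bhid                       # shared with the sigmoid layer
        self.b_prime = np.zeros(n_visible)  # visible bias, owned by the dA only

    def fake_update(self):
        # In-place update standing in for one pretraining step.
        self.W -= 0.5
        self.b -= 0.1

sigmoid_layer = SigmoidLayer(n_in=4, n_out=3)
da_layer = DenoisingAE(W=sigmoid_layer.W, bhid=sigmoid_layer.b, n_visible=4)
da_layer.fake_update()

print(sigmoid_layer.W is da_layer.W)               # both names refer to one array
print(sigmoid_layer.W[0, 0], sigmoid_layer.b[0])   # the MLP layer sees the update
```

This is why pretraining the autoencoders initializes the MLP without any explicit copying of weights.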

All we need to do now is add a logistic layer on top of the sigmoid layers, which gives us an MLP. We use the LogisticRegression class introduced in the tutorial on classifying MNIST digits with logistic regression.

        # We now need to add a logistic layer on top of the MLP
        self.logLayer = LogisticRegression(
            input=self.sigmoid_layers[-1].output,
            n_in=hidden_layers_sizes[-1],
            n_out=n_outs
        )

        self.params.extend(self.logLayer.params)
        # construct a function that implements one step of finetuning

        # compute the cost for second phase of training,
        # defined as the negative log likelihood
        self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)
        # compute the gradients with respect to the model parameters
        # symbolic variable that points to the number of errors made on the
        # minibatch given by self.x and self.y
        self.errors = self.logLayer.errors(self.y)

The SdA class also provides a method that generates training functions for the denoising autoencoders in its layers. They are returned as a list in which element i is a function that implements one training step for the dA corresponding to layer i.

    def pretraining_functions(self, train_set_x, batch_size):
        ''' Generates a list of functions, each of them implementing one
        step in training the dA corresponding to the layer with same index.
        The function will require as input the minibatch index, and to train
        a dA you just need to iterate, calling the corresponding function on
        all minibatch indexes.

        :type train_set_x: theano.tensor.TensorType
        :param train_set_x: Shared variable that contains all datapoints used
                            for training the dA

        :type batch_size: int
        :param batch_size: size of a [mini]batch
        '''
        # index to a [mini]batch
        index = T.lscalar('index')

To be able to change the corruption level or the learning rate during training, we associate them with Theano variables.

        corruption_level = T.scalar('corruption')  # % of corruption to use
        learning_rate = T.scalar('lr')  # learning rate to use
        # beginning of a batch, given `index`
        batch_begin = index * batch_size
        # ending of a batch given `index`
        batch_end = batch_begin + batch_size

        pretrain_fns = []
        for dA in self.dA_layers:
            # get the cost and the updates list
            cost, updates = dA.get_cost_updates(corruption_level,
                                                learning_rate)
            # compile the theano function
            fn = theano.function(
                inputs=[
                    index,
                    theano.In(corruption_level, value=0.2),
                    theano.In(learning_rate, value=0.1)
                ],
                outputs=cost,
                updates=updates,
                givens={
                    self.x: train_set_x[batch_begin: batch_end]
                }
            )
            # append `fn` to the list of functions
            pretrain_fns.append(fn)

        return pretrain_fns

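Any function pretrain_fns[i] now takes index as an argument and, optionally, corruption (the corruption level) or lr (the learning rate). Note that these parameter names are the names given to the Theano variables when they were constructed, not the names of the Python variables (learning_rate or corruption_level). Keep this in mind when working with Theano. In the same fashion, we build a method that constructs the functions required during fine-tuning (train_fn, valid_score, and test_score).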

    def build_finetune_functions(self, datasets, batch_size, learning_rate):
        '''Generates a function `train` that implements one step of
        finetuning, a function `validate` that computes the error on
        a batch from the validation set, and a function `test` that
        computes the error on a batch from the testing set

        :type datasets: list of pairs of theano.tensor.TensorType
        :param datasets: It is a list that contains all the datasets;
                         it has to contain three pairs, `train`,
                         `valid`, `test` in this order, where each pair
                         is formed of two Theano variables, one for the
                         datapoints, the other for the labels

        :type batch_size: int
        :param batch_size: size of a minibatch

        :type learning_rate: float
        :param learning_rate: learning rate used during finetune stage
        '''
        (train_set_x, train_set_y) = datasets[0]
        (valid_set_x, valid_set_y) = datasets[1]
        (test_set_x, test_set_y) = datasets[2]

        # compute number of minibatches for validation and testing
        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
        n_valid_batches //= batch_size
        n_test_batches = test_set_x.get_value(borrow=True).shape[0]
        n_test_batches //= batch_size

        index = T.lscalar('index')  # index to a [mini]batch

        # compute the gradients with respect to the model parameters
        gparams = T.grad(self.finetune_cost, self.params)

        # compute list of fine-tuning updates
        updates = [
            (param, param - gparam * learning_rate)
            for param, gparam in zip(self.params, gparams)
        ]

        train_fn = theano.function(
            inputs=[index],
            outputs=self.finetune_cost,
            updates=updates,
            givens={
                self.x: train_set_x[
                    index * batch_size: (index + 1) * batch_size
                ],
                self.y: train_set_y[
                    index * batch_size: (index + 1) * batch_size
                ]
            },
        )

        test_score_i = theano.function(
            [index],
            self.errors,
            givens={
                self.x: test_set_x[
                    index * batch_size: (index + 1) * batch_size
                ],
                self.y: test_set_y[
                    index * batch_size: (index + 1) * batch_size
                ]
            },
        )

        valid_score_i = theano.function(
            [index],
            self.errors,
            givens={
                self.x: valid_set_x[
                    index * batch_size: (index + 1) * batch_size
                ],
                self.y: valid_set_y[
                    index * batch_size: (index + 1) * batch_size
                ]
            },
        )

        # Create a function that scans the entire validation set
        def valid_score():
            return [valid_score_i(i) for i in range(n_valid_batches)]

        # Create a function that scans the entire test set
        def test_score():
            return [test_score_i(i) for i in range(n_test_batches)]

        return train_fn, valid_score, test_score

Note that valid_score and test_score are not Theano functions. They are Python functions that loop over the entire validation set and the entire test set, respectively, and produce a list of the losses over those sets.
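The minibatch indexing and the per-batch list of losses can be sketched without Theano. The snippet below assumes the per-batch score is the fraction of misclassified examples; the helper names `batch_slice` and `score_over_batches` and the toy labels are hypothetical.

```python
import numpy as np

def batch_slice(index, batch_size):
    """Rows index * batch_size up to (index + 1) * batch_size."""
    return slice(index * batch_size, (index + 1) * batch_size)

def score_over_batches(y_true, y_pred, batch_size):
    # Integer division: a trailing partial batch is dropped.
    n_batches = len(y_true) // batch_size
    losses = []
    for i in range(n_batches):
        rows = batch_slice(i, batch_size)
        losses.append(float(np.mean(y_true[rows] != y_pred[rows])))
    return losses

y_true = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y_pred = np.array([0, 1, 2, 0, 4, 5, 0, 0, 8, 9])

losses = score_over_batches(y_true, y_pred, batch_size=4)
print(losses)           # [0.25, 0.5]
print(np.mean(losses))  # 0.375
```

Averaging the returned list, as in the last line, gives a single validation or test score for the whole set.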

Putting it all together

The few lines of code below construct a stacked denoising autoencoder:

numpy_rng = numpy.random.RandomState(89677)
print('... building the model')
# construct the stacked denoising autoencoder class
sda = SdA(
    numpy_rng=numpy_rng,
    n_ins=28 * 28,
    hidden_layers_sizes=[1000, 1000, 1000],
    n_outs=10
)

The network is trained in two stages: layer-wise pretraining followed by fine-tuning.

For the pretraining stage, we loop over all the layers of the network. For each layer, we use the compiled function that implements one SGD step, optimizing the weights to reduce the reconstruction cost of that layer. This function is applied to the training set for a fixed number of epochs given by pretraining_epochs.

#########################
# PRETRAINING THE MODEL #
#########################
# Needs: import os, sys, timeit, numpy. train_set_x, batch_size,
# n_train_batches, pretraining_epochs and pretrain_lr are defined
# elsewhere in the script.
print('... getting the pretraining functions')
pretraining_fns = sda.pretraining_functions(train_set_x=train_set_x,
                                            batch_size=batch_size)

print('... pre-training the model')
start_time = timeit.default_timer()
## Pre-train layer-wise
corruption_levels = [.1, .2, .3]
for i in range(sda.n_layers):
    # go through pretraining epochs
    for epoch in range(pretraining_epochs):
        # go through the training set
        c = []
        for batch_index in range(n_train_batches):
            c.append(pretraining_fns[i](index=batch_index,
                                        corruption=corruption_levels[i],
                                        lr=pretrain_lr))
        print('Pre-training layer %i, epoch %d, cost %f'
              % (i, epoch, numpy.mean(c, dtype='float64')))

end_time = timeit.default_timer()

print(('The pretraining code for file ' +
       os.path.split(__file__)[1] +
       ' ran for %.2fm' % ((end_time - start_time) / 60.)), file=sys.stderr)

The fine-tuning loop here is very similar to the one used for the multilayer perceptron. The only difference is that it uses the functions returned by build_finetune_functions.

Running the code

You can run the code from the command line with:

python code/SdA.py

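By default, the code runs 15 pretraining epochs for each layer with a batch size of 1. The corruption level is 0.1 for the first layer, 0.2 for the second, and 0.3 for the third. The pretraining learning rate is 0.001 and the fine-tuning learning rate is 0.1. Pretraining takes 585.01 minutes, an average of 13 minutes per epoch. Fine-tuning completes after 36 epochs in 444.2 minutes, an average of 12.34 minutes per epoch. The final validation score is 1.39% and the test score is 1.3%. These results were obtained on a machine with an Intel Xeon E5430 @ 2.66GHz CPU, using a single-threaded GotoBLAS.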
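The per-epoch averages quoted above follow directly from the reported totals, assuming the three hidden layers are each pretrained for 15 epochs. A quick check:

```python
n_layers = 3
pretraining_epochs = 15
pretrain_minutes = 585.01
finetune_epochs = 36
finetune_minutes = 444.2

total_pretrain_epochs = n_layers * pretraining_epochs
print(total_pretrain_epochs)                               # 45
print(round(pretrain_minutes / total_pretrain_epochs, 2))  # 13.0
print(round(finetune_minutes / finetune_epochs, 2))        # 12.34
```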

Original article: https://codeburst.io/deep-learning-types-and-autoencoders-a40ee6754663

[This article is an original translation by "机器之心", a 51CTO column contributor; WeChat official account "机器之心" (id: almosthuman2014)]


Editor: Zhao Ningning. Source: 51CTO column
