AlexNet Convolutional Neural Network in Pure MATLAB, a Low-Level Implementation (Part 1): Forward Propagation
The main goal of this program is to build AlexNet from scratch, in order to gain a deep understanding of the fundamentals of neural networks: convolution, pooling, local response normalization, fully connected layers, momentum-based stochastic gradient descent, and the parameter-update algorithm of convolutional neural networks.
Writing this post helps me organize my own thinking and keeps me on schedule; I also hope it is useful to others who want a deeper understanding of how convolutional neural networks actually compute.
Thanks to the many thorough analyses of AlexNet that others have published, the full network structure can essentially be reconstructed. One question still came up, though: the first convolutional layer outputs a tensor of depth 96, so why are the second layer's kernels only 48 deep?
The answer is GPU memory. Because of its limits at the time, AlexNet splits the convolution kernels into two groups and runs them on two GPUs. The 96 is the total number of kernels across both GPUs; each GPU holds 48 kernels, so the output is two tensors of depth 48. This leads to a useful observation:
Grouping has a large effect on the number of convolution parameters. Take the transition from the first layer to the second: the first layer has 96 kernels. Split into two groups of 48, after the pooling layer it outputs two tensors of size 27*27*48, so the second layer's kernels are 48 deep, with 128 kernels per group (128+128 in total), giving (5*5*48*128+128)*2 = 307456 parameters. Without grouping, the first layer would output a single 27*27*96 tensor, the second layer's kernels would be 96 deep with 256 of them, giving 5*5*96*256+256 = 614656 parameters, almost double.
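The arithmetic above can be verified in a couple of lines (weight counts only, plus one bias per kernel):

```matlab
% Parameter counts for the second convolutional layer, grouped vs. ungrouped.
grouped   = (5*5*48*128 + 128) * 2;   % two groups: kernels of size 5*5*48, 128 per group
ungrouped = 5*5*96*256 + 256;         % one group: kernels of size 5*5*96, 256 kernels
% grouped = 307456, ungrouped = 614656
```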
Here is the code for the AlexNet structure, where kernalcell holds the convolution kernel parameters and bias holds the biases.
function [ output,kernalcell,biasout,Edeltak,Edeltab ] = myalexnet( X,label,kernalcell,bias,Edeltak,Edeltab )
% 1st Layer: Conv (w ReLU) -> LRN -> Pool
conv1_1 = conv(X, kernalcell{1}{1}, bias{1}{1}, 0, 4, 4);
conv1_2 = conv(X, kernalcell{1}{2}, bias{1}{2}, 0, 4, 4);       % 55*55*48
norm1_1 = local_response_norm(conv1_1, 2, 1, 2e-05, 0.75);
norm1_2 = local_response_norm(conv1_2, 2, 1, 2e-05, 0.75);
pool1_1 = max_pool(norm1_1, 3, 3, 2, 2);
pool1_2 = max_pool(norm1_2, 3, 3, 2, 2);                        % 27*27*48
% 2nd Layer: Conv (w ReLU) -> LRN -> Pool
conv2_1 = conv(pool1_1, kernalcell{2}{1}, bias{2}{1}, 2, 1, 1);
conv2_2 = conv(pool1_2, kernalcell{2}{2}, bias{2}{2}, 2, 1, 1); % 27*27*128
conv2 = cat(3, conv2_1, conv2_2);
norm2 = local_response_norm(conv2, 2, 1, 2e-05, 0.75);
pool2 = max_pool(norm2, 3, 3, 2, 2);                            % 13*13*256*batch
% 3rd Layer: Conv (w ReLU)
conv3_1 = conv(pool2, kernalcell{3}{1}, bias{3}{1}, 1, 1, 1);
conv3_2 = conv(pool2, kernalcell{3}{2}, bias{3}{2}, 1, 1, 1);   % 13*13*192*batch
% 4th Layer: Conv (w ReLU), split into two groups
conv4_1 = conv(conv3_1, kernalcell{4}{1}, bias{4}{1}, 1, 1, 1);
conv4_2 = conv(conv3_2, kernalcell{4}{2}, bias{4}{2}, 1, 1, 1); % 13*13*192*batch
% 5th Layer: Conv (w ReLU) -> Pool, split into two groups
conv5_1 = conv(conv4_1, kernalcell{5}{1}, bias{5}{1}, 1, 1, 1);
conv5_2 = conv(conv4_2, kernalcell{5}{2}, bias{5}{2}, 1, 1, 1); % 13*13*128*batch
conv5 = cat(3, conv5_1, conv5_2);                               % 13*13*256*batch
pool5 = max_pool(conv5, 3, 3, 2, 2);                            % 6*6*256*batch
% 6th Layer: Flatten -> FC (w ReLU) -> Dropout
batch = size(pool5, 4);
pool5 = reshape(pool5, [], batch);                              % 9216*batch
fc6 = fc(pool5, kernalcell{6}, bias{6});                        % 4096*batch
%dropout6 = dropout(fc6);
% 7th Layer: FC (w ReLU) -> Dropout
fc7 = fc(fc6, kernalcell{7}, bias{7});                          % 4096*batch
%dropout7 = dropout(fc7);
% 8th Layer: FC, returns unscaled activations
fc8 = fc(fc7, kernalcell{8}, bias{8});
% Softmax layer
output = zeros(size(fc8, 1), batch);                            % numclass*batch
for b = 1:batch
    output(:, b) = exp(fc8(:, b)) / sum(exp(fc8(:, b)));
end
end
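One caveat about the softmax loop at the end: exp overflows for large logits. A common remedy (not in the code above, added here as a hypothetical improvement) is to subtract the column maximum first, which leaves the result mathematically unchanged:

```matlab
% Numerically stable softmax for one column of logits.
z = [1000; 1001; 999];                      % exp(1000) alone would overflow to Inf
p = exp(z - max(z)) / sum(exp(z - max(z))); % same probabilities as exp(z)/sum(exp(z))
```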
The forward pass uses four functions: convolution, max-pooling downsampling, local response normalization, and the ReLU activation. They are introduced one by one below.
Convolution function:
The tricky part is understanding 2-D multi-channel convolution; see link 3 below.
Here is the convolution function (the convolution layer applies the ReLU function directly at the end):
function [ output_args ] = conv( input_args, kernal, bias, padding, stridew, strideh )
%CONV convolution layer, followed directly by ReLU
%Dimensions are given in parentheses.
%input_args input data (height*width*channel*batch)
%kernal     convolution kernels (kernalheight*kernalwidth*channel*num)
%bias       biases (1*num)
%padding    number of zero-padding rings around the input
%stridew    horizontal stride
%strideh    vertical stride
heightk = size(kernal,1);
widthk = size(kernal,2);
channelk = size(kernal,3);
numk = size(kernal,4);
widthin = size(input_args,2);
heightin = size(input_args,1);
channel = size(input_args,3);
batch = size(input_args,4);
widthout = (widthin+2*padding-widthk)/stridew+1;
heightout = (heightin+2*padding-heightk)/strideh+1;
if channelk~=channel
    fprintf('kernel channel ~= input channel');
end
%zero padding (the first dimension is the height)
inputz = zeros(heightin+2*padding,widthin+2*padding,channel,batch);
inputz(padding+1:padding+heightin,padding+1:padding+widthin,:,:) = input_args;
output_args = zeros(heightout,widthout,numk,batch);
for b = 1:batch
    for d = 1:numk
        for i = 1:heightout
            for j = 1:widthout
                for n = 1:channel
                    output_args(i,j,d,b) = output_args(i,j,d,b)+conv2(rot90(inputz((i-1)*strideh+1:(i-1)*strideh+heightk,(j-1)*stridew+1:(j-1)*stridew+widthk,n,b),2),kernal(:,:,n,d),'valid');
                end
                %rot90(...,2) rotates the patch by 180 degrees; conv2 flips its input,
                %so this turns the convolution into the cross-correlation a CNN computes.
                %See: https://www.cnblogs.com/zf-blog/p/8638664.html
            end
        end
        output_args(:,:,d,b) = output_args(:,:,d,b)+bias(d); %add bias
    end
end
output_args = ReLU(output_args);
end
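The functions above call a ReLU helper that is not listed in this post; a minimal sketch (my own assumption of its definition, an elementwise max with zero) would be:

```matlab
function [ output_args ] = ReLU( input_args )
%RELU elementwise rectified linear unit: max(x, 0)
output_args = max(input_args, 0);
end
```

With this on the path, a quick shape check of conv: calling conv(rand(227,227,3,1), randn(11,11,3,48), zeros(1,48), 0, 4, 4) should return a 55*55*48 tensor, matching (227-11)/4+1 = 55.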
Local response normalization (LRN):
This technique is said to be rarely used nowadays, but to faithfully reproduce AlexNet it is implemented here anyway.
function [ output_args ] = local_response_norm( input_args, depth_radius, bias, alpha, beta )
%LOCAL_RESPONSE_NORM local response normalization
%For each point, compute its normalized value over a local window of size
%depth_radius along the channel dimension.
%input_args input data (height*width*channel*batch)
widthin = size(input_args,2);
heightin = size(input_args,1);
channel = size(input_args,3);
batch = size(input_args,4);
output_args = zeros(heightin,widthin,channel,batch);
for n = 1:channel
    sumbegin = max(1,n-depth_radius/2);
    sumend = min(channel,n+depth_radius/2);
    for b = 1:batch
        for i = 1:heightin
            for j = 1:widthin
                sqr_sum = sum(input_args(i,j,sumbegin:sumend,b).^2);
                output_args(i,j,n,b) = input_args(i,j,n,b)/(bias+alpha*sqr_sum)^beta;
            end
        end
    end
end
end
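As a small sanity check (a hypothetical example, not from the original post): with depth_radius = 2, each channel n is normalized by the squared sum over channels n-1 to n+1, clipped to the valid range.

```matlab
% 1*1*3*1 input, depth_radius = 2, bias = 1, alpha = 1, beta = 1:
% channel 1 is divided by (1 + (1^2 + 2^2))^1 = 6.
a = reshape([1 2 3], 1, 1, 3, 1);
out = local_response_norm(a, 2, 1, 1, 1);
% out(1,1,1,1) = 1/6
```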
Max-pooling function:
The principle is simple, so here is the code directly:
function [ output_args ] = max_pool( input_args, poolsizewidth, poolsizeheight, stridew, strideh )
%MAX_POOL max pooling
%The pooling size is assumed to divide the input size evenly.
%input_args input data (height*width*channel*batch)
widthin = size(input_args,2);
heightin = size(input_args,1);
deepin = size(input_args,3);
batchin = size(input_args,4);
widthout = (widthin-poolsizewidth)/stridew+1;
heightout = (heightin-poolsizeheight)/strideh+1;
output_args = zeros(heightout,widthout,deepin,batchin);
for b = 1:batchin
    for d = 1:deepin
        for i = 1:heightout
            for j = 1:widthout
                output_args(i,j,d,b) = max(max(input_args((i-1)*strideh+1:(i-1)*strideh+poolsizeheight,(j-1)*stridew+1:(j-1)*stridew+poolsizewidth,d,b)));
            end
        end
    end
end
end
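A quick shape check for max_pool, using the sizes of the first pooling layer:

```matlab
% 55*55*48 input, 3*3 window, stride 2 -> (55-3)/2+1 = 27, i.e. 27*27*48.
p = max_pool(rand(55, 55, 48, 1), 3, 3, 2, 2);
size(p)   % 27 27 48 (size drops the trailing singleton batch dimension)
```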
Fully connected layer:
function [ output_args ] = fc( input_args, kernal, bias )
%FC fully connected layer, followed by ReLU
%input_args input data (inputsize*batch)
%kernal     weights, size(kernal) = inputsize*outputsize, where outputsize is the number of neurons
%bias       biases, size(bias) = outputsize*1
batch = size(input_args,2);
output_args = zeros(size(kernal,2),batch);
for b = 1:batch
    output_args(:,b) = kernal'*input_args(:,b)+bias;
end
output_args = ReLU(output_args);
end
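A minimal shape check for fc, assuming (as in the layer sizes above) the input has already been flattened to an inputsize*batch matrix and bias is a column vector:

```matlab
W = randn(9216, 4096) * 0.01;   % inputsize * outputsize
b = zeros(4096, 1);
y = fc(rand(9216, 2), W, b);    % a batch of 2 flattened pool5 vectors
size(y)   % 4096 2, one 4096-dimensional activation per sample
```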
The whole exercise is about understanding the low-level algorithms behind convolutional neural networks; I encourage you to start from the underlying principles and write the MATLAB code yourself.
All of the code above is my own, written after studying various online resources, so there may well be misunderstandings or mistakes. Corrections are welcome!
I would also be happy to discuss ways to make the code run more efficiently!
That wraps up AlexNet's forward pass. The next post will present the code for the backpropagation step.
Finally, my thanks go to the experts who shared their analyses of the AlexNet architecture.
References:
1. The role, principle, and computation of each AlexNet layer, and the construction of each layer's convolution and pooling kernel sizes: https://blog.csdn.net/chaipp0607/article/details/72847422
2. Parameter counts of each AlexNet layer: https://vimsky.com/article/3664.html
3. Multi-channel convolution: https://blog.csdn.net/yudiemiaomiao/article/details/72466402
4. Local response normalization (LRN): https://blog.csdn.net/yangdashi888/article/details/77918311