AlexNet convolutional neural network, implemented from scratch in pure MATLAB (Part 1): forward propagation

The main goal of this program is to implement the AlexNet network by hand in order to gain a deeper understanding of the basics of neural networks: convolution, pooling, local response normalization, fully connected layers, momentum-based stochastic gradient descent, and the parameter-update algorithm of convolutional neural networks.

Writing this post helps me organize my own thinking and pushes me to keep up the pace; I also hope it can help others who want a deeper understanding of how a convolutional neural network actually computes.

Based on the thorough analyses of AlexNet that others have published, the whole network structure can essentially be reconstructed. One question still came up, though: the output of the first convolutional layer is 96 channels deep, so why are the kernels of the second layer only 48 channels deep?

Because of limited GPU memory, AlexNet splits the kernels into two groups during convolution and runs them on two GPUs. The 96 is the total number of kernels across both GPUs; each GPU holds 48 kernels, so the output consists of two tensors, each 48 channels deep. From this we can conclude:

Grouping has a large effect on the number of kernel parameters. Take the transition from the first layer to the second: the first layer has 96 kernels, split into two groups of 48 each. After the pooling layer the output is two tensors of size 27*27*48, so the second layer's kernels are 48 channels deep, with 128+128 of them, and the second layer has (5*5*48*128+128)*2 = 307456 parameters. Without grouping, the first layer's output tensor would be 27*27*96, the second layer's kernels would be 96 channels deep with 256 of them, and the parameter count would be 5*5*96*256+256 = 614656, almost twice as many. (The arithmetic is checked in the short snippet below.)
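As a quick sanity check, both parameter counts (weights plus biases) can be reproduced with a few lines of MATLAB, using only the layer sizes given above:

% Grouped: two groups, each with 128 kernels of size 5*5*48.
grouped   = (5*5*48*128 + 128) * 2;   % = 307456
% Ungrouped: 256 kernels of size 5*5*96.
ungrouped = 5*5*96*256 + 256;         % = 614656
fprintf('grouped: %d  ungrouped: %d\n', grouped, ungrouped);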

Here is the code for the AlexNet structure; kernalcell holds the kernel parameters and bias holds the biases.


function [ output,kernalcell,biasout,Edeltak,Edeltab ] = myalexnet( X,label,kernalcell,bias,Edeltak,Edeltab )
    batch = size(X,4);   % mini-batch size (X: height*width*channel*batch)
    biasout = bias;      % biases are passed through; they are only updated in the backward pass
    %1st Layer: Conv (w ReLu) -> Lrn -> Pool
    conv1_1 = conv(X, kernalcell{1}{1},bias{1}{1},0,4, 4);
    conv1_2 = conv(X, kernalcell{1}{2}, bias{1}{2},0,4, 4);%55*55*48
    norm1_1 = local_response_norm(conv1_1, 2,1, 2e-05, 0.75);
    norm1_2 = local_response_norm(conv1_2, 2,1, 2e-05, 0.75);
    pool1_1 = max_pool(norm1_1, 3, 3, 2, 2);
    pool1_2 = max_pool(norm1_2, 3, 3, 2, 2);%27*27*48
    %2nd Layer: Conv (w ReLu)  -> Lrn -> Pool 
    conv2_1 = conv(pool1_1, kernalcell{2}{1}, bias{2}{1}, 2, 1, 1);
    conv2_2 = conv(pool1_2, kernalcell{2}{2}, bias{2}{2}, 2, 1, 1);%27*27*128
    conv2 = cat(3,conv2_1,conv2_2);
    norm2 = local_response_norm(conv2, 2,1, 2e-05, 0.75);
    pool2 = max_pool(norm2, 3, 3, 2, 2);%13*13*256*b
    %3rd Layer: Conv (w ReLu)
    conv3_1 = conv(pool2, kernalcell{3}{1}, bias{3}{1},1,1, 1);
    conv3_2 = conv(pool2, kernalcell{3}{2},bias{3}{2},1,1, 1);%13*13*192*b
    %4th Layer: Conv (w ReLu) splitted into two groups
    conv4_1 = conv(conv3_1, kernalcell{4}{1},bias{4}{1},1, 1, 1);
    conv4_2 = conv(conv3_2, kernalcell{4}{2},bias{4}{2},1, 1, 1);%13*13*192*b
    %5th Layer: Conv (w ReLu) -> Pool splitted into two groups
    conv5_1 = conv(conv4_1, kernalcell{5}{1},bias{5}{1}, 1,1, 1);
    conv5_2 = conv(conv4_2, kernalcell{5}{2},bias{5}{2}, 1,1, 1); %output: 13*13*128*batch
    conv5 = cat(3,conv5_1,conv5_2);%13*13*256*b
    pool5 = max_pool(conv5, 3, 3, 2, 2);%6*6*256*batch
   % 6th Layer: Flatten -> FC (w ReLu) -> Dropout
   size_pool5.a=size(pool5,1);
   size_pool5.b=size(pool5,2);
   size_pool5.c=size(pool5,3);
	pool5=reshape(pool5,size_pool5.a*size_pool5.b*size_pool5.c,batch);%flatten to 9216*batch (batch = size(pool5,4))
    fc6 = fc(pool5, kernalcell{6},bias{6});%4096*batch
    %dropout6=dropout(conv6);
    %7th Layer: FC (w ReLu) -> Dropout
    %conv6=rand(1,1,4096);
    fc7 = fc(fc6,kernalcell{7},bias{7}); %4096*batch
  %  dropout7 = dropout(fc7);

    %8th Layer: FC and return unscaled activations
    fc8 = fc(fc7, kernalcell{8},bias{8});
    %softmax Layer
    output=zeros(size(fc8,1),batch);%numclass*batch
    for b=1:batch;
        output(:,b)= exp(fc8(:,b))/sum(exp(fc8(:,b)));
    end

end
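For reference, here is a minimal driver sketch (not part of the original code) showing one way the kernalcell and bias cell arrays could be laid out and the network called. The 227*227*3 input size, the random initialization, and the batch size of 2 are placeholders chosen only so that all the shapes line up with the comments above:

X = rand(227,227,3,2);                                     % batch of 2 images
label = [];                                                % unused in the forward pass
for g = 1:2                                                % the two GPU groups
    kernalcell{1}{g} = 0.01*randn(11,11,3,48);   bias{1}{g} = zeros(1,48);
    kernalcell{2}{g} = 0.01*randn(5,5,48,128);   bias{2}{g} = zeros(1,128);
    kernalcell{3}{g} = 0.01*randn(3,3,256,192);  bias{3}{g} = zeros(1,192);
    kernalcell{4}{g} = 0.01*randn(3,3,192,192);  bias{4}{g} = zeros(1,192);
    kernalcell{5}{g} = 0.01*randn(3,3,192,128);  bias{5}{g} = zeros(1,128);
end
kernalcell{6} = 0.01*randn(9216,4096);  bias{6} = zeros(4096,1);   % 6*6*256 = 9216
kernalcell{7} = 0.01*randn(4096,4096);  bias{7} = zeros(4096,1);
kernalcell{8} = 0.01*randn(4096,1000);  bias{8} = zeros(1000,1);   % 1000 classes
Edeltak = [];  Edeltab = [];                               % momentum terms, used in the backward pass
output = myalexnet(X,label,kernalcell,bias,Edeltak,Edeltab);   % 1000*2 class probabilities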

The forward pass uses four functions: convolution, max-pooling (down-sampling), local response normalization, and the ReLU activation. They are described one by one below.

The convolution function:

The tricky part is understanding 2-D convolution over multiple channels; see reference 3.

Here is the convolution function code (the convolutional layer is followed directly by the ReLU function):


function [ output_args ] = conv( input_args, kernal,bias,padding, stridew, strideh )
%CONV multi-channel 2-D convolution, followed directly by ReLU
%the sizes in parentheses give the dimensions of each argument
%input_args  input data (height*width*channel*batch)
%kernal      convolution kernels (kernalheight*kernalwidth*channel*num)
%bias        biases (1*num)
%padding     number of rings of zeros added around the input
%stridew     horizontal stride
%strideh     vertical stride
%
    heightk = size(kernal,1);
    widthk = size(kernal,2);
    channelk = size(kernal,3);
    numk = size(kernal,4);
    
    widthin=size(input_args,2);
    heightin=size(input_args,1);
    channel = size(input_args,3);
    batch = size(input_args,4);
    widthout = (widthin+2*padding-widthk)/stridew+1;
    heightout = (heightin+2*padding-heightk)/strideh+1;
    if channelk~=channel
        fprintf('kernalchannel~=channel');
    end
    %zero padding
    inputz = zeros(heightin+2*padding,widthin+2*padding,channel,batch);
    inputz(padding+1:padding+heightin,padding+1:padding+widthin,:,:)=input_args;
    
    output_args = zeros(heightout,widthout,numk,batch);
    for b = 1:batch
        for d = 1:numk
            for i=1:heightout
                for j=1:widthout
                    for n = 1:channel
                        output_args(i,j,d,b) = output_args(i,j,d,b)+conv2(rot90(inputz( (i-1)*strideh+1 : (i-1)*strideh+heightk , (j-1)*stridew+1 : (j-1)*stridew+widthk ,n,b),2),kernal(:,:,n,d),'valid');  
                    end%rot90(...,2) rotates the patch by 180 degrees so that conv2(...,'valid') computes a cross-correlation; see https://www.cnblogs.com/zf-blog/p/8638664.html
                end
            end
            output_args(:,:,d,b) = output_args(:,:,d,b)+bias(d);%add the bias
        end
    end 
    
    output_args = ReLU(output_args);
   

end
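A quick sanity check with random placeholder data: the output size should follow (in + 2*padding - kernel)/stride + 1. (One caveat: naming the function conv shadows MATLAB's built-in conv; since only conv2 is used internally this still works, but renaming it, e.g. to myconv, would be safer.)

in  = rand(27,27,48,2);                   % e.g. one pooled group from layer 1
k   = 0.01*randn(5,5,48,128);             % layer-2 kernels for that group
out = conv(in, k, zeros(1,128), 2, 1, 1); % padding 2, stride 1
size(out)                                 % expected: 27 27 128 2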

Local response normalization (LRN):

This method is said to be rarely used nowadays, but I implemented it anyway in order to reproduce AlexNet faithfully.


function [ output_args ] = local_response_norm( input_args,depth_radius,bias,alpha,beta )
%LOCAL_RESPONSE_NORM local response normalization
%for each position, normalize across the channel dimension over a local window of size depth_radius
%input_args  input data (height*width*channel*batch)
    widthin=size(input_args,2);
    heightin=size(input_args,1);
    channel = size(input_args,3);
    batch = size(input_args,4);
    output_args = zeros(heightin,widthin,channel,batch);
    for n = 1:channel
        sumbegin = max(1,n-depth_radius/2);
        sumend = min(channel,n+depth_radius/2);
        for b = 1:batch
            for i=1:heightin
                for j=1:widthin
                    sqr_sum=sum(input_args(i,j,sumbegin:sumend,b).^2);
                    output_args(i,j,n,b)=input_args(i,j,n,b)/(bias+alpha*sqr_sum)^beta;
                end
            end
        end
    end

end
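As a quick check, each output value should equal input/(bias + alpha*S)^beta, where S is the sum of squares over the local channel window. A small example with random placeholder data:

x = rand(13,13,8,1);                               % random placeholder activations
y = local_response_norm(x, 2, 1, 2e-05, 0.75);
n = 4;                                             % pick a middle channel
S = sum(x(1,1,max(1,n-1):min(8,n+1),1).^2);        % window: depth_radius/2 channels on each side
y(1,1,n,1) - x(1,1,n,1)/(1 + 2e-05*S)^0.75         % should print (roughly) zero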

The max-pooling function:

The idea is simple, so here is the code directly:


function [ output_args ] = max_pool( input_args,poolsizewidth,poolsizeheight,stridew,strideh )
%MAX_POOL max pooling (down-sampling)
%the pooling window and stride are assumed to tile the input size exactly
%input_args  input data (height*width*channel*batch)
    widthin = size(input_args,2);
    heightin = size(input_args,1);
    deepin = size(input_args,3);
    batchin = size(input_args,4);
    widthout=(widthin-poolsizewidth)/stridew+1;
    heightout = (heightin-poolsizeheight)/strideh+1;
    output_args = zeros(heightout,widthout,deepin,batchin);
    for b = 1:batchin
        for d = 1:deepin
            for i = 1:heightout
                for j = 1:widthout
                output_args(i,j,d,b)=max(max(input_args( (i-1)*strideh+1 : (i-1)*strideh+poolsizeheight , (j-1)*stridew+1 : (j-1)*stridew+poolsizewidth , d,b )));        
                end
            end
        end
    end
end
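A tiny example with placeholder data, sized so the 2*2 windows with stride 2 tile the input exactly:

x = rand(4,4,1,1);
y = max_pool(x, 2, 2, 2, 2);        % output is 2*2*1*1
y(1,1) == max(max(x(1:2,1:2)))      % should print 1 (true)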

The fully connected layer:


function [ output_args ] = fc( input_args,kernal,bias )
%FC fully connected layer, followed directly by ReLU
%input_args  input data (inputsize*batch)
%kernal      weights, size(kernal) = inputsize*outputsize, where outputsize is the number of neurons
%bias        biases, size(bias) = outputsize*1
    batch = size(input_args,2);
    output_args = zeros(size(kernal,2),batch);
    for b = 1:batch
        output_args(:,b) = kernal'*input_args(:,b)+bias;
    end
    output_args = ReLU(output_args);
end
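Both conv and fc call a ReLU helper that is not listed in this post; it is presumably just the element-wise rectifier, something like the sketch below:

function [ output_args ] = ReLU( input_args )
%RELU element-wise rectified linear unit
    output_args = max(input_args, 0);
end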

The whole exercise is really about understanding the low-level algorithms behind convolutional neural networks, and I encourage you to start from the underlying principles and write your own MATLAB implementation.

All of the code above is my own, written after working through various online resources, so there may well be places where I misunderstood something or got it wrong; corrections are welcome!

I would also be happy to discuss ways of making the code run faster!

That wraps up AlexNet's forward pass. The next post will give the code for the backward pass.

Finally, I want to thank the experts who shared their analyses of the AlexNet network.

References:

1. What each AlexNet layer does, how it works and is computed, and how the convolution/pooling kernel sizes are chosen: https://blog.csdn.net/chaipp0607/article/details/72847422

2. Parameter counts of each AlexNet layer: https://vimsky.com/article/3664.html

3. Multi-channel convolution: https://blog.csdn.net/yudiemiaomiao/article/details/72466402

4. Local response normalization (LRN): https://blog.csdn.net/yangdashi888/article/details/77918311