AlexNet Convolutional Neural Network in Pure MATLAB, a Low-Level Implementation (Part 1): Forward Propagation
The main goal of this program is to build AlexNet from scratch, in order to gain a deep understanding of the fundamentals of neural networks: convolution, pooling, local response normalization, fully connected layers, momentum-based stochastic gradient descent, and the parameter-update algorithm of convolutional neural networks.
Writing this post helps me organize my own thinking and keeps me on schedule; I also hope it is useful to others who want a deeper understanding of how convolutional neural networks actually compute.
Thanks to the many thorough analyses of AlexNet that others have published, the full network structure can essentially be reconstructed. One question still came up, though: the first convolutional layer outputs a tensor of depth 96, so why are the second layer's kernels only 48 deep?
The answer is GPU memory. Because of its limits at the time, AlexNet splits the convolution kernels into two groups and runs them on two GPUs. The 96 is the total number of kernels across both GPUs; each GPU holds 48 kernels, so the output is two tensors of depth 48. This leads to a useful observation:
Grouping has a large effect on the number of convolution parameters. Take the transition from the first layer to the second: the first layer has 96 kernels. Split into two groups of 48, after the pooling layer it outputs two tensors of size 27*27*48, so the second layer's kernels are 48 deep, with 128 kernels per group (128+128 in total), giving (5*5*48*128+128)*2 = 307456 parameters. Without grouping, the first layer would output a single 27*27*96 tensor, the second layer's kernels would be 96 deep with 256 of them, giving 5*5*96*256+256 = 614656 parameters, almost double.
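The arithmetic above can be verified in a couple of lines (weight counts only, plus one bias per kernel):

```matlab
% Parameter counts for the second convolutional layer, grouped vs. ungrouped.
grouped   = (5*5*48*128 + 128) * 2;   % two groups: kernels of size 5*5*48, 128 per group
ungrouped = 5*5*96*256 + 256;         % one group: kernels of size 5*5*96, 256 kernels
% grouped = 307456, ungrouped = 614656
```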
Here is the code for the AlexNet structure, where kernalcell holds the convolution kernel parameters and bias holds the biases.
function [ output,kernalcell,biasout,Edeltak,Edeltab ] = myalexnet( X,label,kernalcell,bias,Edeltak,Edeltab )
% 1st Layer: Conv (w ReLU) -> LRN -> Pool
conv1_1 = conv(X, kernalcell{1}{1}, bias{1}{1}, 0, 4, 4);
conv1_2 = conv(X, kernalcell{1}{2}, bias{1}{2}, 0, 4, 4);       % 55*55*48
norm1_1 = local_response_norm(conv1_1, 2, 1, 2e-05, 0.75);
norm1_2 = local_response_norm(conv1_2, 2, 1, 2e-05, 0.75);
pool1_1 = max_pool(norm1_1, 3, 3, 2, 2);
pool1_2 = max_pool(norm1_2, 3, 3, 2, 2);                        % 27*27*48
% 2nd Layer: Conv (w ReLU) -> LRN -> Pool
conv2_1 = conv(pool1_1, kernalcell{2}{1}, bias{2}{1}, 2, 1, 1);
conv2_2 = conv(pool1_2, kernalcell{2}{2}, bias{2}{2}, 2, 1, 1); % 27*27*128
conv2 = cat(3, conv2_1, conv2_2);
norm2 = local_response_norm(conv2, 2, 1, 2e-05, 0.75);
pool2 = max_pool(norm2, 3, 3, 2, 2);                            % 13*13*256*batch
% 3rd Layer: Conv (w ReLU)
conv3_1 = conv(pool2, kernalcell{3}{1}, bias{3}{1}, 1, 1, 1);
conv3_2 = conv(pool2, kernalcell{3}{2}, bias{3}{2}, 1, 1, 1);   % 13*13*192*batch
% 4th Layer: Conv (w ReLU), split into two groups
conv4_1 = conv(conv3_1, kernalcell{4}{1}, bias{4}{1}, 1, 1, 1);
conv4_2 = conv(conv3_2, kernalcell{4}{2}, bias{4}{2}, 1, 1, 1); % 13*13*192*batch
% 5th Layer: Conv (w ReLU) -> Pool, split into two groups
conv5_1 = conv(conv4_1, kernalcell{5}{1}, bias{5}{1}, 1, 1, 1);
conv5_2 = conv(conv4_2, kernalcell{5}{2}, bias{5}{2}, 1, 1, 1); % 13*13*128*batch
conv5 = cat(3, conv5_1, conv5_2);                               % 13*13*256*batch
pool5 = max_pool(conv5, 3, 3, 2, 2);                            % 6*6*256*batch
% 6th Layer: Flatten -> FC (w ReLU) -> Dropout
batch = size(pool5, 4);
pool5 = reshape(pool5, [], batch);                              % 9216*batch
fc6 = fc(pool5, kernalcell{6}, bias{6});                        % 4096*batch
%dropout6 = dropout(fc6);
% 7th Layer: FC (w ReLU) -> Dropout
fc7 = fc(fc6, kernalcell{7}, bias{7});                          % 4096*batch
%dropout7 = dropout(fc7);
% 8th Layer: FC, returns unscaled activations
fc8 = fc(fc7, kernalcell{8}, bias{8});
% Softmax layer
output = zeros(size(fc8, 1), batch);                            % numclass*batch
for b = 1:batch
    output(:, b) = exp(fc8(:, b)) / sum(exp(fc8(:, b)));
end
end
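One caveat about the softmax loop at the end: exp overflows for large logits. A common remedy (not in the code above, added here as a hypothetical improvement) is to subtract the column maximum first, which leaves the result mathematically unchanged:

```matlab
% Numerically stable softmax for one column of logits.
z = [1000; 1001; 999];                      % exp(1000) alone would overflow to Inf
p = exp(z - max(z)) / sum(exp(z - max(z))); % same probabilities as exp(z)/sum(exp(z))
```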
The forward pass uses four functions: convolution, max-pooling downsampling, local response normalization, and the ReLU activation. They are introduced one by one below.
Convolution function:
The tricky part is understanding 2-D multi-channel convolution; see link 3 below.
Here is the convolution function (the convolution layer applies the ReLU function directly at the end):
function [ output_args ] = conv( input_args, kernal, bias, padding, stridew, strideh )
%CONV convolution layer, followed directly by ReLU
%Dimensions are given in parentheses.
%input_args input data (height*width*channel*batch)
%kernal     convolution kernels (kernalheight*kernalwidth*channel*num)
%bias       biases (1*num)
%padding    number of zero-padding rings around the input
%stridew    horizontal stride
%strideh    vertical stride
heightk = size(kernal,1);
widthk = size(kernal,2);
channelk = size(kernal,3);
numk = size(kernal,4);
widthin = size(input_args,2);
heightin = size(input_args,1);
channel = size(input_args,3);
batch = size(input_args,4);
widthout = (widthin+2*padding-widthk)/stridew+1;
heightout = (heightin+2*padding-heightk)/strideh+1;
if channelk~=channel
    fprintf('kernel channel ~= input channel');
end
%zero padding (the first dimension is the height)
inputz = zeros(heightin+2*padding,widthin+2*padding,channel,batch);
inputz(padding+1:padding+heightin,padding+1:padding+widthin,:,:) = input_args;
output_args = zeros(heightout,widthout,numk,batch);
for b = 1:batch
    for d = 1:numk
        for i = 1:heightout
            for j = 1:widthout
                for n = 1:channel
                    output_args(i,j,d,b) = output_args(i,j,d,b)+conv2(rot90(inputz((i-1)*strideh+1:(i-1)*strideh+heightk,(j-1)*stridew+1:(j-1)*stridew+widthk,n,b),2),kernal(:,:,n,d),'valid');
                end
                %rot90(...,2) rotates the patch by 180 degrees; conv2 flips its input,
                %so this turns the convolution into the cross-correlation a CNN computes.
                %See: https://www.cnblogs.com/zf-blog/p/8638664.html
            end
        end
        output_args(:,:,d,b) = output_args(:,:,d,b)+bias(d); %add bias
    end
end
output_args = ReLU(output_args);
end
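The functions above call a ReLU helper that is not listed in this post; a minimal sketch (my own assumption of its definition, an elementwise max with zero) would be:

```matlab
function [ output_args ] = ReLU( input_args )
%RELU elementwise rectified linear unit: max(x, 0)
output_args = max(input_args, 0);
end
```

With this on the path, a quick shape check of conv: calling conv(rand(227,227,3,1), randn(11,11,3,48), zeros(1,48), 0, 4, 4) should return a 55*55*48 tensor, matching (227-11)/4+1 = 55.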
Local response normalization (LRN):
This technique is said to be rarely used nowadays, but to faithfully reproduce AlexNet it is implemented here anyway.
function [ output_args ] = local_response_norm( input_args, depth_radius, bias, alpha, beta )
%LOCAL_RESPONSE_NORM local response normalization
%For each point, compute its normalized value over a local window of size
%depth_radius along the channel dimension.
%input_args input data (height*width*channel*batch)
widthin = size(input_args,2);
heightin = size(input_args,1);
channel = size(input_args,3);
batch = size(input_args,4);
output_args = zeros(heightin,widthin,channel,batch);
for n = 1:channel
    sumbegin = max(1,n-depth_radius/2);
    sumend = min(channel,n+depth_radius/2);
    for b = 1:batch
        for i = 1:heightin
            for j = 1:widthin
                sqr_sum = sum(input_args(i,j,sumbegin:sumend,b).^2);
                output_args(i,j,n,b) = input_args(i,j,n,b)/(bias+alpha*sqr_sum)^beta;
            end
        end
    end
end
end
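As a small sanity check (a hypothetical example, not from the original post): with depth_radius = 2, each channel n is normalized by the squared sum over channels n-1 to n+1, clipped to the valid range.

```matlab
% 1*1*3*1 input, depth_radius = 2, bias = 1, alpha = 1, beta = 1:
% channel 1 is divided by (1 + (1^2 + 2^2))^1 = 6.
a = reshape([1 2 3], 1, 1, 3, 1);
out = local_response_norm(a, 2, 1, 1, 1);
% out(1,1,1,1) = 1/6
```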
Max-pooling function:
The principle is simple, so here is the code directly:
function [ output_args ] = max_pool( input_args, poolsizewidth, poolsizeheight, stridew, strideh )
%MAX_POOL max pooling
%The pooling size is assumed to divide the input size evenly.
%input_args input data (height*width*channel*batch)
widthin = size(input_args,2);
heightin = size(input_args,1);
deepin = size(input_args,3);
batchin = size(input_args,4);
widthout = (widthin-poolsizewidth)/stridew+1;
heightout = (heightin-poolsizeheight)/strideh+1;
output_args = zeros(heightout,widthout,deepin,batchin);
for b = 1:batchin
    for d = 1:deepin
        for i = 1:heightout
            for j = 1:widthout
                output_args(i,j,d,b) = max(max(input_args((i-1)*strideh+1:(i-1)*strideh+poolsizeheight,(j-1)*stridew+1:(j-1)*stridew+poolsizewidth,d,b)));
            end
        end
    end
end
end
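A quick shape check for max_pool, using the sizes of the first pooling layer:

```matlab
% 55*55*48 input, 3*3 window, stride 2 -> (55-3)/2+1 = 27, i.e. 27*27*48.
p = max_pool(rand(55, 55, 48, 1), 3, 3, 2, 2);
size(p)   % 27 27 48 (size drops the trailing singleton batch dimension)
```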
Fully connected layer:
function [ output_args ] = fc( input_args, kernal, bias )
%FC fully connected layer, followed by ReLU
%input_args input data (inputsize*batch)
%kernal     weights, size(kernal) = inputsize*outputsize, where outputsize is the number of neurons
%bias       biases, size(bias) = outputsize*1
batch = size(input_args,2);
output_args = zeros(size(kernal,2),batch);
for b = 1:batch
    output_args(:,b) = kernal'*input_args(:,b)+bias;
end
output_args = ReLU(output_args);
end
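A minimal shape check for fc, assuming (as in the layer sizes above) the input has already been flattened to an inputsize*batch matrix and bias is a column vector:

```matlab
W = randn(9216, 4096) * 0.01;   % inputsize * outputsize
b = zeros(4096, 1);
y = fc(rand(9216, 2), W, b);    % a batch of 2 flattened pool5 vectors
size(y)   % 4096 2, one 4096-dimensional activation per sample
```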
The whole exercise is about understanding the low-level algorithms behind convolutional neural networks; I encourage you to start from the underlying principles and write the MATLAB code yourself.
All of the code above is my own, written after studying various online resources, so there may well be misunderstandings or mistakes. Corrections are welcome!
I would also be happy to discuss ways to make the code run more efficiently!
That wraps up AlexNet's forward pass. The next post will present the code for the backpropagation step.
Finally, my thanks go to the experts who shared their analyses of the AlexNet architecture.
References:
1. The role, principle, and computation of each AlexNet layer, and the construction of each layer's convolution and pooling kernel sizes: https://blog.csdn.net/chaipp0607/article/details/72847422
2. Parameter counts of each AlexNet layer: https://vimsky.com/article/3664.html
3. Multi-channel convolution: https://blog.csdn.net/yudiemiaomiao/article/details/72466402
4. Local response normalization (LRN): https://blog.csdn.net/yangdashi888/article/details/77918311