Inception V3 Network Architecture
1. Design Principles for Convolutional Network Architectures
[1] - Avoid representational bottlenecks, especially early in the network. A feed-forward network can be represented as an acyclic graph from the input layer to the classifier or regressor, which defines the direction of information flow. A representational bottleneck occurs when an intermediate layer heavily compresses the feature dimensions (e.g. via pooling), so that the feature size drops sharply from input to output and features are lost. In theory, because of this loss (for example of the correlation structure of the features), the information content cannot be captured by the output representation alone; it provides only a rough estimate. The network structure should therefore be optimized to reduce the feature loss caused by pooling and similar operations.
[2] - Higher-dimensional feature representations make the network easier to train to convergence. Increasing the activations per tile in a convolutional network allows more disentangled features, and the network trains faster. When the input information is decomposed so that correlations between sub-features are weak while correlations within each sub-feature are strong, aggregating the strongly correlated features makes the network converge more easily.
[3] - Spatial aggregation over low-dimensional features loses little or no representational power. For example, before a more spread-out (e.g. 3x3) convolution, a 1x1 convolution can reduce the dimension of the input features prior to the spatial aggregation. The presumed reason is that if the outputs are used in a spatial-aggregation context, the strong correlation between adjacent units means that dimensionality reduction causes little information loss.
[4] - Balance the width and depth of the network.
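Principle [3] can be made concrete with a small cost sketch (pure Python; the channel counts are illustrative assumptions, not values from the paper): a 1x1 convolution shrinks the channel dimension before the spatially spread-out 3x3 convolution.

```python
def conv_cost(kernel_h, kernel_w, c_in, c_out):
    """Multiply-accumulates per output position of a convolution."""
    return kernel_h * kernel_w * c_in * c_out

c_in, c_mid, c_out = 256, 64, 256  # illustrative channel counts

# Direct 3x3 convolution on the full 256-channel input.
direct = conv_cost(3, 3, c_in, c_out)

# 1x1 reduction to 64 channels, then the 3x3 on the reduced embedding.
reduced = conv_cost(1, 1, c_in, c_mid) + conv_cost(3, 3, c_mid, c_out)

# With these numbers the reduced path is 3.6x cheaper per output position.
```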
2. Factorizing Convolutions with Large Filter Size
2.1 Factorization into Smaller Convolutions
One 5x5 convolution is equivalent to two consecutive 3x3 convolutions.
Assuming the 5x5 convolution and the two consecutive 3x3 convolutions produce the same number of output features, the ratio of their computational costs is 5x5 / (3x3 + 3x3) = 25/18.
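The 25/18 ratio can be verified with a quick sketch (pure Python, counting multiply-accumulates per output position; the channel count k is an illustrative assumption):

```python
def conv_cost(kernel_h, kernel_w, c_in, c_out):
    """Multiply-accumulates per output position of a convolution."""
    return kernel_h * kernel_w * c_in * c_out

k = 64  # illustrative channel count

# One 5x5 convolution vs. two stacked 3x3 convolutions with the same
# number of input/output channels.
cost_5x5 = conv_cost(5, 5, k, k)
cost_two_3x3 = 2 * conv_cost(3, 3, k, k)

# The channel terms cancel, leaving exactly the 25/18 ratio from the text.
ratio = cost_5x5 / cost_two_3x3
```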
2.2 Spatial Factorization into Asymmetric Convolutions
A 3x3 convolution can in turn be factorized into a 3x1 convolution followed by a 1x3 convolution.
A 7x7 kernel is decomposed into two kernels (1x7, 7x1), and a 3x3 kernel into (1x3, 3x1).
This both speeds up computation and, by splitting one convolutional layer into two, deepens the network and increases its non-linearity.
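The savings from asymmetric factorization can be sketched the same way (pure Python; k is an illustrative channel count):

```python
def conv_cost(kernel_h, kernel_w, c_in, c_out):
    """Multiply-accumulates per output position of a convolution."""
    return kernel_h * kernel_w * c_in * c_out

k = 64  # illustrative channel count

# 7x7 factored into 1x7 followed by 7x1: 49 k^2 vs. 14 k^2 per position.
cost_7x7 = conv_cost(7, 7, k, k)
cost_factored = conv_cost(1, 7, k, k) + conv_cost(7, 1, k, k)

# 3x3 factored into 1x3 followed by 3x1: 9 k^2 vs. 6 k^2 per position.
cost_3x3 = conv_cost(3, 3, k, k)
cost_3x3_factored = conv_cost(1, 3, k, k) + conv_cost(3, 1, k, k)
```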
3. Efficient Grid Size Reduction
In general, CNNs use pooling operations to reduce the grid size of the feature maps. To avoid a representational bottleneck, the activation dimension of the network filters is expanded before applying max or average pooling.
For example, to go from a d x d grid with k filters to a (d/2) x (d/2) grid with 2k filters, one would first compute a stride-1 convolution with 2k filters and then apply pooling. The total cost is dominated by the 2d²k² operations of the convolution on the larger grid.
A cheaper alternative is to pool first and then convolve, which costs only 2(d/2)²k², reducing the computation to 1/4. However, this introduces a representational bottleneck: the representation drops to (d/2) x (d/2) x k, which weakens the network's expressive power (see Figure 9).
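The two orderings can be compared with a small sketch (pure Python; d and k are illustrative values, and only the dominant convolution term is counted, as in the text):

```python
def reduction_cost(d, k, conv_first):
    """Dominant convolution cost of halving a d x d grid with k filters
    while doubling the filter count to 2k."""
    if conv_first:
        # Stride-1 conv to 2k filters on the full d x d grid, then pool.
        return 2 * k * k * d * d
    # Pool down to (d/2) x (d/2) first, then convolve to 2k filters.
    return 2 * k * k * (d // 2) * (d // 2)

d, k = 36, 128  # illustrative grid size and filter count
conv_then_pool = reduction_cost(d, k, conv_first=True)
pool_then_conv = reduction_cost(d, k, conv_first=False)
# pool-then-conv is 1/4 the cost, but introduces a representational bottleneck
```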
The approach adopted here, shown in Figure 10, removes the representational bottleneck while also reducing computation: it uses two parallel stride-2 operations (a convolution branch and a pooling branch) whose outputs are concatenated.
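The parallel-branch reduction can be sketched in terms of output shapes (pure Python; the concrete numbers are illustrative, not taken from Figure 10):

```python
def parallel_reduction(d, conv_filters, pool_filters):
    """Output shape of concatenating a stride-2 conv branch and a
    stride-2 pool branch, each halving a d x d grid."""
    conv_branch = (d // 2, d // 2, conv_filters)  # stride-2 convolution
    pool_branch = (d // 2, d // 2, pool_filters)  # stride-2 pooling keeps its input channels
    # Channel-wise concat of the two branches.
    return (d // 2, d // 2, conv_branch[2] + pool_branch[2])

# e.g. a 36 x 36 grid with k = 128 channels: the conv branch outputs k
# filters and the pool branch passes through the original k channels,
# giving 2k filters at half the grid size in a single step.
shape = parallel_reduction(36, conv_filters=128, pool_filters=128)
```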
4. Inception V3 Network Architecture
The method in Figure 10 is used to reduce the grid size between Inception modules.
Zero-padded convolutions are used to maintain the grid size.
Inside the Inception modules, zero-padded convolutions are likewise used to keep the grid size unchanged.
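TensorFlow's output-size rules show why zero ('SAME') padding keeps the grid fixed inside a module while stride-2 'VALID' convolutions shrink it between modules (a minimal sketch of the sizing formulas, assuming TF's conventions):

```python
import math

def conv_out_size(in_size, kernel, stride, padding):
    # TensorFlow sizing rules:
    #   SAME  -> ceil(in / stride)                  (zero-padded)
    #   VALID -> ceil((in - kernel + 1) / stride)   (no padding)
    if padding == 'SAME':
        return math.ceil(in_size / stride)
    return math.ceil((in_size - kernel + 1) / stride)

# Inside an Inception block, a 3x3 SAME stride-1 conv keeps the 35 x 35
# grid, while VALID would shrink it to 33 x 33.
same = conv_out_size(35, 3, 1, 'SAME')
valid = conv_out_size(35, 3, 1, 'VALID')

# Between modules, a stride-2 VALID convolution halves the grid: 35 -> 17.
reduced = conv_out_size(35, 3, 2, 'VALID')
```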
5. Inception V3 Definition in TensorFlow Slim
"""
Inception V3 classification network definition.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from nets import inception_utils
slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
def inception_v3_base(inputs,
final_endpoint='Mixed_7c',
min_depth=16,
depth_multiplier=1.0,
scope=None):
"""
Inception V3 基礎網路結構定義.
根據給定的輸入和最終網路節點構建 Inception V3 網路.
可以構建表格中從輸入到 inception 模組 Mixed_7c 的網路結構.
注:網路層的名字與論文裡的不對應,但,構建的網路相同.
old_names 到 new names 的對映:
Old name | New name
=======================================
conv0 | Conv2d_1a_3x3
conv1 | Conv2d_2a_3x3
conv2 | Conv2d_2b_3x3
pool1 | MaxPool_3a_3x3
conv3 | Conv2d_3b_1x1
conv4 | Conv2d_4a_3x3
pool2 | MaxPool_5a_3x3
mixed_35x35x256a | Mixed_5b
mixed_35x35x288a | Mixed_5c
mixed_35x35x288b | Mixed_5d
mixed_17x17x768a | Mixed_6a
mixed_17x17x768b | Mixed_6b
mixed_17x17x768c | Mixed_6c
mixed_17x17x768d | Mixed_6d
mixed_17x17x768e | Mixed_6e
mixed_8x8x1280a | Mixed_7a
mixed_8x8x2048a | Mixed_7b
mixed_8x8x2048b | Mixed_7c
Args:
inputs: a tensor of size [batch_size, height, width, channels].
final_endpoint: specifies the endpoint at which network construction ends, i.e. the network depth.
Can be one of: ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3',
'MaxPool_5a_3x3', 'Mixed_5b', 'Mixed_5c', 'Mixed_5d',
'Mixed_6a', 'Mixed_6b', 'Mixed_6c', 'Mixed_6d', 'Mixed_6e',
'Mixed_7a', 'Mixed_7b', 'Mixed_7c'].
min_depth: Minimum depth value (number of channels) for all convolution ops.
Enforced when depth_multiplier < 1;
not an active constraint when depth_multiplier >= 1.
depth_multiplier: Float multiplier for the depth (number of channels) of all convolution ops.
The value must be greater than zero.
Typically set to a float in (0, 1) to reduce the parameter count or the computational cost of the model.
scope: Optional variable_scope.
Returns:
tensor_out: output tensor corresponding to the network's final_endpoint.
end_points: a set of activations for external use, e.g. summaries and losses.
Raises:
ValueError: if final_endpoint is not set to one of the predefined values,
or depth_multiplier <= 0
"""
# end_points will collect relevant activations for external use, e.g. summaries or losses.
end_points = {}
if depth_multiplier <= 0:
raise ValueError('depth_multiplier is not greater than zero.')
depth = lambda d: max(int(d * depth_multiplier), min_depth)
with tf.variable_scope(scope, 'InceptionV3', [inputs]):
with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='VALID'):
# 299 x 299 x 3
end_point = 'Conv2d_1a_3x3'
net = slim.conv2d(inputs, depth(32), [3, 3], stride=2, scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 149 x 149 x 32
end_point = 'Conv2d_2a_3x3'
net = slim.conv2d(net, depth(32), [3, 3], scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 147 x 147 x 32
end_point = 'Conv2d_2b_3x3'
net = slim.conv2d(net, depth(64), [3, 3], padding='SAME', scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 147 x 147 x 64
end_point = 'MaxPool_3a_3x3'
net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 73 x 73 x 64
end_point = 'Conv2d_3b_1x1'
net = slim.conv2d(net, depth(80), [1, 1], scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 73 x 73 x 80.
end_point = 'Conv2d_4a_3x3'
net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 71 x 71 x 192.
end_point = 'MaxPool_5a_3x3'
net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# 35 x 35 x 192.
# Inception blocks
with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
# mixed: 35 x 35 x 256.
end_point = 'Mixed_5b'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
branch_1 = slim.conv2d(branch_1, depth(64), [5, 5], scope='Conv2d_0b_5x5')
with tf.variable_scope('Branch_2'):
branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0b_3x3')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0c_3x3')
with tf.variable_scope('Branch_3'):
branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
branch_3 = slim.conv2d(branch_3, depth(32), [1, 1], scope='Conv2d_0b_1x1')
net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# mixed_1: 35 x 35 x 288.
end_point = 'Mixed_5c'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0b_1x1')
branch_1 = slim.conv2d(branch_1, depth(64), [5, 5], scope='Conv_1_0c_5x5')
with tf.variable_scope('Branch_2'):
branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0b_3x3')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0c_3x3')
with tf.variable_scope('Branch_3'):
branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
branch_3 = slim.conv2d(branch_3, depth(64), [1, 1], scope='Conv2d_0b_1x1')
net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# mixed_2: 35 x 35 x 288.
end_point = 'Mixed_5d'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
branch_1 = slim.conv2d(branch_1, depth(64), [5, 5], scope='Conv2d_0b_5x5')
with tf.variable_scope('Branch_2'):
branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0b_3x3')
branch_2 = slim.conv2d(branch_2, depth(96), [3, 3], scope='Conv2d_0c_3x3')
with tf.variable_scope('Branch_3'):
branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
branch_3 = slim.conv2d(branch_3, depth(64), [1, 1], scope='Conv2d_0b_1x1')
net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# mixed_3: 17 x 17 x 768.
end_point = 'Mixed_6a'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(384), [3, 3], stride=2,
padding='VALID', scope='Conv2d_1a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
branch_1 = slim.conv2d(branch_1, depth(96), [3, 3], scope='Conv2d_0b_3x3')
branch_1 = slim.conv2d(branch_1, depth(96), [3, 3], stride=2,
padding='VALID', scope='Conv2d_1a_1x1')
with tf.variable_scope('Branch_2'):
branch_2 = slim.max_pool2d(net, [3, 3], stride=2,
padding='VALID', scope='MaxPool_1a_3x3')
net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# mixed4: 17 x 17 x 768.
end_point = 'Mixed_6b'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
branch_1 = slim.conv2d(branch_1, depth(128), [1, 7], scope='Conv2d_0b_1x7')
branch_1 = slim.conv2d(branch_1, depth(192), [7, 1], scope='Conv2d_0c_7x1')
with tf.variable_scope('Branch_2'):
branch_2 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
branch_2 = slim.conv2d(branch_2, depth(128), [7, 1], scope='Conv2d_0b_7x1')
branch_2 = slim.conv2d(branch_2, depth(128), [1, 7], scope='Conv2d_0c_1x7')
branch_2 = slim.conv2d(branch_2, depth(128), [7, 1], scope='Conv2d_0d_7x1')
branch_2 = slim.conv2d(branch_2, depth(192), [1, 7], scope='Conv2d_0e_1x7')
with tf.variable_scope('Branch_3'):
branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
branch_3 = slim.conv2d(branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')
net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
end_points[end_point] = net
if end_point == final_endpoint: return net, end_points
# mixed_5: 17 x 17 x 768.
end_point = 'Mixed_6c'
with tf.variable_scope(end_point):
with tf.variable_scope('Branch_0'):
branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
with tf.variable_scope('Branch_1'):
branch_1 = slim.conv2d(net, depth(160), [1, 1], scope='Conv2d_0a_1x1')
branch_1 = slim.conv2d(branch_1, depth(160), [1, 7], scope='Conv2d_0b_1x7')
branch_1 = slim.conv2d(branch_1