TensorFlow Advanced, Part 2 (Handwritten Digit Recognition with a Convolutional Neural Network)

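1. Obtaining the Dataset

Preface

In the section on gradient descent and optimization, a conventional neural network reached roughly 90% accuracy on the MNIST dataset, which is not a particularly good result.
Next, we will use a convolutional neural network (CNN) to build a more accurate model, with accuracy close to 99%. A CNN slides shared convolution kernels over the image to extract deep features. These features are then combined into a feature vector and fed into a fully connected network, which classifies them much as a conventional neural network would.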
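
To make the idea of a shared kernel concrete, here is a minimal NumPy sketch that slides one kernel over a small image. The function name `conv2d_valid`, the 4x4 image, and the 2x2 kernel are all made up for illustration. Like `tf.nn.conv2d`, it computes a cross-correlation (the kernel is not flipped); unlike the network in this post, it uses no padding.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # The same kernel weights are reused at every position.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 4x4 "image" and 2x2 kernel, for illustration only
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)
print(feature_map)
```

Because the kernel is shared, the layer needs only `kh * kw` weights per kernel regardless of the image size, which is the main saving over a fully connected layer.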

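Dataset overview

This tutorial uses MNIST, a dataset of handwritten digits 0-9 with 60,000 training samples and 10,000 test samples. It is a subset of the NIST database.
On Windows, download the four files directly: train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz, and t10k-labels-idx1-ubyte.gz. The image data is stored in binary files, and each sample image is 28*28 pixels.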
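
The binary files start with a short big-endian header followed by raw pixel bytes. The sketch below builds a tiny fake image file in memory (two blank 28*28 images, an assumption made purely for illustration) and parses it. The magic number 2051 is the value used by MNIST image files.

```python
import struct
import numpy as np

# Build a tiny fake image file in memory: 2 images of 28x28 zero pixels.
header = struct.pack('>IIII', 2051, 2, 28, 28)   # magic, count, rows, cols
payload = bytes(2 * 28 * 28)                     # all-zero pixel data
buffer = header + payload

# Parse the 16-byte header as four big-endian unsigned 32-bit integers.
magic, num, rows, cols = struct.unpack('>IIII', buffer[:16])

# Everything after the header is one unsigned byte per pixel.
images = np.frombuffer(buffer, dtype=np.uint8, offset=16).reshape(num, rows * cols)
print(magic, num, rows, cols, images.shape)
```

Label files follow the same pattern with an 8-byte header (magic number and count) followed by one byte per label.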

1.1 Network architecture diagram

[Figure: network architecture]

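2. Exploring and Preprocessing the Data

2.1 Imports

import os
import struct
import math
import time                                     # timing
from datetime import timedelta

import tensorflow as tf                         # TensorFlow 1.x API (tf.placeholder, tf.Session)
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix    # confusion matrix, for analyzing model errors

# Render plots inline in the notebook
%matplotlib inline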

2.2 Loading the data

def load_mnist(path, kind='train'):
    """Load MNIST images and labels from `path`."""
    # These names must match the extracted files on disk
    labels_path = os.path.join(path, '%s-labels.idx1-ubyte' % kind)
    images_path = os.path.join(path, '%s-images.idx3-ubyte' % kind)

    with open(labels_path, 'rb') as lbpath:
        # Header: magic number and item count (big-endian unsigned 32-bit ints)
        magic, n = struct.unpack('>II', lbpath.read(8))
        labels = np.fromfile(lbpath, dtype=np.uint8)

    with open(images_path, 'rb') as imgpath:
        # Header: magic number, image count, rows, columns
        magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
        # One row of 28 * 28 = 784 pixels per image
        images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(labels), 784)

    return images, labels

# Load the data
X_train, y_train = load_mnist('./data/mnist', kind='train')
print('Rows: %d, columns: %d' % (X_train.shape[0], X_train.shape[1]))

X_test, y_test = load_mnist('./data/mnist', kind='t10k')
print('Rows: %d, columns: %d' % (X_test.shape[0], X_test.shape[1]))
print(y_train[:5], y_test[:5])

Output:
	Rows: 60000, columns: 784
	Rows: 10000, columns: 784
	[5 0 4 1 9] [7 2 1 0 4]

2.3 Converting labels to one-hot format

def dense_to_one_hot(labels_dense, num_classes=10):
    """Convert class labels to one-hot vectors."""
    num_labels = labels_dense.shape[0]
    # Offset of the first element of each row in the flattened matrix
    index_offset = np.arange(num_labels) * num_classes
    labels_one_hot = np.zeros((num_labels, num_classes))
    # Set the entry at (row, label) to 1 through the flat view
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
    return labels_one_hot

y_train01 = dense_to_one_hot(y_train, num_classes=10)
y_test01 = dense_to_one_hot(y_test, num_classes=10)

print(y_train01[:5])
print(y_test01[:5])

# Recover the class indices of the test set from the one-hot labels
test = np.argmax(y_test01, axis=1)
print(test[:5])
print("Sample shape:", X_train.shape)
print("Label shape:", y_train01.shape)
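
As a cross-check on `dense_to_one_hot`, the same encoding can be sketched by indexing an identity matrix. This is an alternative shown for illustration, not the method used in the rest of this post; the five labels are the first five training labels printed earlier.

```python
import numpy as np

labels = np.array([5, 0, 4, 1, 9], dtype=np.uint8)  # first five training labels
one_hot = np.eye(10)[labels]             # row i of the identity matrix encodes class i
recovered = np.argmax(one_hot, axis=1)   # argmax undoes the encoding
print(one_hot.shape)
print(recovered)
```

Each row contains a single 1 at the position of its class, so `np.argmax` along axis 1 returns the original labels.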

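2.4 Data dimensions

img_size = 28                          # image height and width
img_size_flat = img_size * img_size    # length of the flattened image vector
img_shape = (img_size, img_size)       # 2-D shape of an image
num_channels = 1                       # input is a single-channel grayscale image
num_classes = 10                       # number of classes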
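
These constants describe how a flattened row of 784 pixels maps back to an image. As a rough illustration, the sketch below reshapes a hypothetical batch of three blank images into the `[num_images, height, width, channels]` layout that the convolutional layers expect.

```python
import numpy as np

img_size = 28
num_channels = 1

# Three hypothetical flattened images, one per row
batch = np.zeros((3, img_size * img_size), dtype=np.uint8)

# -1 lets NumPy infer the batch dimension from the data
as_images = batch.reshape(-1, img_size, img_size, num_channels)
print(batch.shape, as_images.shape)
```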

2.5 Plotting some sample images

def plot_images(images, cls_true, cls_pred=None):
    """
    Plot images with their true and predicted labels.
    images: the images (9 of them)
    cls_true: true classes
    cls_pred: predicted classes
    """
    assert len(images) == len(cls_true) == 9    # exactly 9 images are required
    fig, axes = plt.subplots(3, 3)              # create a 3x3 grid of subplots
    fig.subplots_adjust(hspace=0.3, wspace=0.3) # adjust the spacing between subplots
    for i, ax in enumerate(axes.flat):
        # Reshape the 1-D vector into a 2-D matrix; 'binary' is a black-and-white colormap
        ax.imshow(images[i].reshape(img_shape), cmap='binary')
        if cls_pred is None:                  # no predicted classes were passed in
            xlabel = "True: {0}".format(cls_true[i])
        else:
            xlabel = "True: {0}, Pred: {1}".format(cls_true[i], cls_pred[i])
        ax.set_xlabel(xlabel)
        ax.set_xticks([])                     # remove the axis ticks
        ax.set_yticks([])
    plt.show()

indices = np.arange(len(test))
np.random.shuffle(indices)                   # shuffle the indices
indices = indices[:9]                        # pick 9 images at random
images = X_test[indices]
cls_true = test[indices]
plot_images(images, cls_true)

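3. Building the Neural Network

3.1 Defining network parameters and initialization (weights and biases)

# Convolutional layer 1 parameters
filter_size1 = 5          # 5 x 5 kernels
num_filters1 = 16         # 16 kernels in total

# Convolutional layer 2 parameters
filter_size2 = 5          # 5 x 5 kernels
num_filters2 = 36         # 36 kernels in total

# Fully connected layer parameters
fc_size = 128             # number of neurons in the fully connected layer

def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05)) # initialize with random values
def new_biases(length):
    return tf.Variable(tf.constant(0.05, shape=[length]))      # initialize with a constant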
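
`tf.truncated_normal` draws from a normal distribution but re-draws any value that falls more than two standard deviations from the mean. The NumPy sketch below imitates that behavior for illustration only; the function and its `seed` parameter are hypothetical and are not part of TensorFlow.

```python
import numpy as np

def truncated_normal(shape, stddev=0.05, seed=0):
    """Sample N(0, stddev^2), re-drawing values beyond 2 standard deviations."""
    rng = np.random.default_rng(seed)
    values = rng.normal(0.0, stddev, size=shape)
    outside = np.abs(values) > 2 * stddev
    while outside.any():
        # Replace only the out-of-range entries with fresh samples.
        values[outside] = rng.normal(0.0, stddev, size=outside.sum())
        outside = np.abs(values) > 2 * stddev
    return values

# Same shape as the first convolutional layer's weights: 5x5 kernels, 1 channel, 16 kernels
weights = truncated_normal((5, 5, 1, 16))
print(weights.shape, np.abs(weights).max())
```

Keeping the initial weights small and bounded avoids a few extreme values dominating the first training steps.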

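3.2 Defining the convolutional layer

def new_conv_layer(input,                   # previous layer
    num_input_channels,                     # number of channels in the previous layer
    filter_size,                            # width and height of each kernel
    num_filters,                            # number of kernels
    use_pooling=True):                      # apply 2x2 max-pooling
    # Shape of the kernel weights, in the layout required by the TensorFlow API
    shape = [filter_size, filter_size, num_input_channels, num_filters]

    weights = new_weights(shape=shape)      # create weights with the given shape
    biases = new_biases(length=num_filters) # create biases, one per kernel

    # Create the convolution. All strides are set to 1.
    # The first and fourth strides must be 1: the first runs over the images
    # in the batch and the fourth runs over the input channels.
    # The second and third are the strides along the two spatial dimensions.
    # padding='SAME' zero-pads the input so that the output has the same
    # height and width as the input.
    layer = tf.nn.conv2d(input=input,
                        filter=weights,
                        strides=[1, 1, 1, 1],
                        padding='SAME')

    layer += biases     # add a bias to the output, one value per output channel

    if use_pooling:    # whether to apply pooling
        # 2x2 max-pooling: take the maximum of each 2x2 window as the output
        # pixel, then move 2 pixels to the next window.
        layer = tf.nn.max_pool(value=layer,
                              ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1],
                              padding='SAME')

    # ReLU computes max(x, 0): negative values become 0, which adds
    # non-linearity to the otherwise linear output.
    layer = tf.nn.relu(layer)

    # ReLU is normally applied before pooling, but because
    # relu(max_pool(x)) == max_pool(relu(x)), pooling first saves 75% of the
    # ReLU computations.
    # Return the resulting layer (the input to the next layer) and the weights
    # (so they can be inspected later).
    return layer, weights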
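
The claim that ReLU and max-pooling can be swapped is easy to check numerically. The sketch below uses plain NumPy on a single random 28x28 array; `max_pool_2x2` and `relu` are simplified stand-ins written for this check, not the TensorFlow ops.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on a 2-D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(0)
x = rng.normal(size=(28, 28))            # random values, about half of them negative

pool_then_relu = relu(max_pool_2x2(x))   # ReLU runs on 14*14 values
relu_then_pool = max_pool_2x2(relu(x))   # ReLU runs on 28*28 values
print(np.array_equal(pool_then_relu, relu_then_pool), pool_then_relu.shape)
```

The two orders agree because ReLU never changes which element of a window is largest. Pooling first means ReLU touches one value per 2x2 window instead of four, which is where the 75% saving comes from.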

3.3 Defining the flatten layer

def flatten_layer(layer):
    # Shape of the input layer: [num_images, img_height, img_width, num_channels]
    layer_shape = layer.get_shape()

    # Number of features: img_height * img_width * num_channels
    num_features = layer_shape[1:4].num_elements()

    # Reshape to [num_images, num_features].
    # Only the second dimension is set explicitly (to num_features); the first
    # is -1, so num_images is inferred and stays unchanged.
    # Shape after flattening: [num_images, img_height * img_width * num_channels]
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features
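
To see how many features the flatten layer receives, we can trace the image size through the two convolutional layers defined in section 3.1. With 'SAME' padding, a stride-1 convolution keeps the size and each 2x2 pooling step produces ceil(size / 2). The helper name below is made up for this sketch.

```python
import math

def same_pool_output(size, stride=2):
    """Output size of a pooling step with 'SAME' padding."""
    return math.ceil(size / stride)

size = 28                        # input images are 28x28
for _ in range(2):               # two conv layers, each followed by 2x2 pooling
    size = same_pool_output(size)

num_filters2 = 36                # kernels in the second convolutional layer
num_features = size * size * num_filters2
print(size, num_features)
```

So each image reaches the flatten layer as a 7x7x36 volume and leaves it as a vector of 1764 features.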

3.4 Defining the fully connected layer

def new_fc_layer(input,         # previous layer
                num_inputs,     # number of inputs from the previous layer
                num_outputs,    # number of outputs
                use_relu=True): # whether to apply ReLU

    # Create new weights and biases
    weights = new_weights(shape=[num_inputs, num_outputs])
    biases = new_biases(length=num_outputs)

    # Compute y = xW + b
    layer = tf.matmul(input, weights) + biases

    if use_relu:              # whether to apply ReLU
        layer = tf.nn.relu(layer)
    return layer
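
For intuition, the same computation can be written in a few lines of NumPy. The function name and the small weight matrix below are invented for the example; only the formula matches `new_fc_layer`.

```python
import numpy as np

def dense_forward(inputs, weights, biases, use_relu=True):
    """Forward pass of one fully connected layer: xW + b, optionally with ReLU."""
    out = inputs @ weights + biases
    return np.maximum(out, 0) if use_relu else out

# One sample with 2 features mapped to 3 outputs (hypothetical values)
inputs = np.array([[1.0, 2.0]])
weights = np.array([[1.0, 0.0, -1.0],
                    [0.5, 1.0, -1.0]])
biases = np.array([0.0, 0.5, 0.0])

linear = dense_forward(inputs, weights, biases, use_relu=False)
activated = dense_forward(inputs, weights, biases, use_relu=True)
print(linear)
print(activated)
```

The last layer of the network is built with `use_relu=False` because its raw outputs (logits) are passed to softmax instead.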

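3.5 Defining the placeholders

# Input images, one flattened vector of 784 pixels per row
x = tf.placeholder(tf.float32, shape=[None, img_size_flat], name='x')

# Reshape to a 4-D tensor [num_images, height, width, channels] for the convolutional layers
x_image = tf.reshape(x, [-1, img_size, img_size, num_channels])

# True labels in one-hot format
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')

# True class indices, computed with argmax instead of being fed through a separate placeholder
y_true_cls = tf.argmax(y_true, axis=1)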

3.6 Connecting the network (convolutional layers, flatten layer, fully connected layers, predicted class, cost function, optimizer, performance metric)

layer_conv1, weights_conv1 = \
new_conv_layer(input=x_image,                        # input images
              num_input_channels=num_channels,       # number of input channels
              filter_size=filter_size1,              # kernel size
              num_filters=num_filters1,              # number of kernels
              use_pooling=True)
print(layer_conv1)

layer_conv2, weights_conv2 = \
new_conv_layer(input=layer_conv1,
               num_input_channels=num_filters1,
               filter_size=filter_size2,
               num_filters=num_filters2,
               use_pooling=True)
print(layer_conv2)

layer_flat, num_features = flatten_layer(layer_conv2)
print(layer_flat)

layer_fc1 = new_fc_layer(input=layer_flat,          # output of the flatten layer
                         num_inputs=num_features,   # number of input features
                         num_outputs=fc_size,       # number of output features
                         use_relu=True)
print(layer_fc1)

layer_fc2 = new_fc_layer(input=layer_fc1,           # previous fully connected layer
                         num_inputs=fc_size,        # number of input features
                         num_outputs=num_classes,   # number of output classes
                         use_relu=False)
print(layer_fc2)

y_pred = tf.nn.softmax(layer_fc2)                   # normalize the logits with softmax
y_pred_cls = tf.argmax(y_pred, axis=1)              # predicted class

# The cross-entropy op takes the raw logits (layer_fc2) and applies softmax internally
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,labels=y_true)
cost = tf.reduce_mean(cross_entropy)                # mean loss over the batch

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)  # optimizer

correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
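
The cost and accuracy defined here can be reproduced in NumPy on a toy batch. This is only a sketch of the arithmetic, assuming two samples and three classes; it is not how TensorFlow implements the op internally.

```python
import numpy as np

def softmax_cross_entropy(logits, labels_one_hot):
    """Per-sample cross-entropy between softmax(logits) and one-hot labels."""
    # Subtracting the row maximum keeps exp() from overflowing.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels_one_hot * log_probs).sum(axis=1)

# Toy batch: 2 samples, 3 classes (hypothetical values)
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

loss = softmax_cross_entropy(logits, labels)
cost = loss.mean()                                  # counterpart of tf.reduce_mean
pred_cls = logits.argmax(axis=1)                    # counterpart of y_pred_cls
accuracy = (pred_cls == labels.argmax(axis=1)).mean()
print(loss, cost, accuracy)
```

Softmax does not change which logit is largest, so taking argmax of the logits gives the same predicted class as taking argmax of the softmax output.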

3.7 Creating the session and running it

session = tf.Session()                           # create the session
session.run(tf.global_variables_initializer())   # initialize the variables

def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    idx = np.arange(0, len(data))
    np.random.shuffle(idx)                     # pick samples at random
    idx = idx[:num]
    data_shuffle = [data[i] for i in idx]
    labels_shuffle = [labels[i] for i in idx]

    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
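
`next_batch` shuffles all 60,000 indices and builds Python lists on every call. A lighter variant, sketched below under the assumption that `data` and `labels` are NumPy arrays, samples the indices directly and uses array indexing. The function name and the `rng` parameter are hypothetical.

```python
import numpy as np

def next_batch_indexed(num, data, labels, rng=None):
    """Return `num` distinct random samples and their labels."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw `num` distinct row indices without shuffling the whole range.
    idx = rng.choice(len(data), size=num, replace=False)
    return data[idx], labels[idx]

# Toy data: row i holds [2i, 2i+1] and has label i
data = np.arange(20).reshape(10, 2)
labels = np.arange(10)
x_batch, y_batch = next_batch_indexed(4, data, labels, rng=np.random.default_rng(0))
print(x_batch.shape, y_batch.shape)
```

Indexing both arrays with the same `idx` keeps every sample paired with its own label.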

train_batch_size = 64
total_iterations = 0           # total number of iterations run so far

def optimize(num_iterations):

    global total_iterations    # updated at the end of each call
    start_time = time.time()   # for measuring the elapsed time

    for i in range(total_iterations, total_iterations + num_iterations):
        # Fetch a batch of training data and put it in the feed dict
        x_batch, y_true_batch = next_batch(train_batch_size, X_train, y_train01)
        feed_dict_train = {x: x_batch,
                           y_true: y_true_batch}
        session.run(optimizer, feed_dict=feed_dict_train)           # run the optimizer

        if i % 100 == 0 or i==899:                                  # report every 100 iterations and at iteration 900
            acc = session.run(accuracy, feed_dict=feed_dict_train)  # accuracy on the current training batch
            msg = "Iteration: {0:>6}, training accuracy: {1:>6.1%}"
            print(msg.format(i + 1, acc))

    total_iterations += num_iterations
    end_time = time.time()
    time_dif = end_time - start_time

    # Print the elapsed time
    print("Time elapsed: " + str(timedelta(seconds=int(round(time_dif)))))

def plot_example_errors(cls_pred, correct):
    # Select the misclassified test samples
    incorrect = (correct == False)
    images = X_test[incorrect]
    cls_pred = cls_pred[incorrect]
    cls_true = test[incorrect]

    # Pick 9 of them at random
    indices = np.arange(len(images))
    np.random.shuffle(indices)
    indices = indices[:9]

    plot_images(images[indices], cls_true[indices], cls_pred[indices])

def plot_confusion_matrix(cls_pred):
    cls_true = test          # true classes

    # Compute the confusion matrix with scikit-learn's confusion_matrix
    cm = confusion_matrix(y_true=cls_true, y_pred=cls_pred)

    print(cm)                # print the confusion matrix

    # Show the confusion matrix as an image
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
    plt.tight_layout()      # adjust the layout
    plt.colorbar()
    tick_marks = np.arange(num_classes)
    plt.xticks(tick_marks, range(num_classes))
    plt.yticks(tick_marks, range(num_classes))
    plt.xlabel('Predicted')
    plt.ylabel('True')
    plt.show()
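
To show what the confusion matrix contains, here is a small NumPy sketch that counts (true, predicted) pairs by hand. It follows the same convention as scikit-learn's `confusion_matrix` (rows are true classes, columns are predicted classes); the function name and the five toy labels are made up.

```python
import numpy as np

def confusion_counts(cls_true, cls_pred, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(cm, (cls_true, cls_pred), 1)
    return cm

# Toy example with 3 classes: one sample of class 1 is mistaken for class 2
cls_true = np.array([0, 1, 2, 2, 1])
cls_pred = np.array([0, 2, 2, 2, 1])
cm = confusion_counts(cls_true, cls_pred, num_classes=3)
print(cm)
print(np.trace(cm) / cm.sum())
```

The diagonal holds the correct predictions, so the trace divided by the total is the accuracy, and the off-diagonal entries show which digits get confused with which.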

test_batch_size = 256                                  # split the test set into smaller batches

def print_test_accuracy(show_example_errors=False,
                        show_confusion_matrix=False):
    num_test = len(X_test)                             # number of images in the test set

    cls_pred = np.zeros(shape=num_test, dtype=int)     # array for the predicted classes (np.int was removed from newer NumPy)

    i = 0                                              # start index of the current batch
    while i < num_test:
        j = min(i + test_batch_size, num_test)         # end index of the current batch
        images = X_test[i:j, :]                        # images between i and j
        labels = y_test01[i:j, :]                      # the corresponding labels

        feed_dict = {x: images,   y_true: labels}      # create the feed dict

        # Compute the predicted classes
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)
        i = j                                          # start of the next batch

    cls_true = test
    correct = (cls_true == cls_pred)                   # correctly classified samples
    correct_sum = correct.sum()                        # number of correct classifications

    acc = float(correct_sum) / num_test                # classification accuracy
    msg = "Test set accuracy: {0:.1%} ({1}/{2})"       # print the accuracy
    print(msg.format(acc, correct_sum, num_test))

    if show_example_errors:                            # plot some misclassified samples
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)

    if show_confusion_matrix:                          # print the confusion matrix
        print("Confusion Matrix:")
        plot_confusion_matrix(cls_pred=cls_pred)
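
The `while` loop in `print_test_accuracy` walks through the test set in slices of at most 256 images. The slicing logic on its own looks like this; the generator name is invented for the sketch.

```python
def batch_ranges(num_items, batch_size):
    """Yield (start, end) index pairs that cover range(num_items) in order."""
    start = 0
    while start < num_items:
        end = min(start + batch_size, num_items)   # the last batch may be shorter
        yield start, end
        start = end

ranges = list(batch_ranges(10000, 256))
print(len(ranges), ranges[0], ranges[-1])
```

For the 10,000 test images this gives 39 full batches and one final batch of 16 images, so every image is predicted exactly once.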
              
           
              
              
            