Training on a Small Dataset with a Pre-trained Model: a tf.keras Implementation

阿新 • Published: 2020-11-04

> //20201104

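> Preface: I have recently been practicing with Keras projects. Today I trained a small dataset using a pre-trained model, and this post records and summarizes the process.

> PS: this post is written in Markdown and can be opened in any Markdown editor.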

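### First, some terminology:

- Pre-trained model: a network (model) that someone else (an individual or a team) has already spent a long time training on a large dataset, with the resulting weights published online.

- Small data set: in practice there is often not much labeled data available for training, so making accurate predictions from a small dataset becomes especially important.

> Note: a model trained by someone else cannot simply be retrained as-is. When reusing such a model (a CNN in this example), you generally keep the layers that extract features and discard the final fully connected (dense) layers. The last layers encode information that may be specific to the dataset the network was originally trained on, so it does not generalize, whereas the shallow convolutional layers extract features that are more broadly useful. In a very deep network, the higher convolutional layers extract more abstract features and are often not reused on your own small dataset either. For example, in a cat-vs-dog classifier the higher convolutional layers may respond to quite specific things such as ears, eyes, and noses, while the shallow layers pick up general features such as color and texture; those higher layers would clearly be a poor fit for an elephant classification task.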

### This post uses the Dogs vs. Cats dataset from Kaggle; download link: https://www.kaggle.com/c/dogs-vs-cats/data

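- This post covers three ways of using a pre-trained model:

  - Extract features with the convolutional base, save them as NumPy arrays, and feed them to custom layers

  - Freeze the convolutional base and stack custom layers on top of it

  - Fine-tuning: unfreeze the top layers of the base and retrain them on the training set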

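### Method 1: extract features and save them as NumPy arrays (drawback: image augmentation cannot be used; advantage: fast)

#### 1. Data preparation

- First create a `data` folder in the project directory and unzip the downloaded archive there. Then run the code below, which copies the number of images needed for training from the source data into a new directory tree, sorted by split and class.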

import os, shutil

# Project root directory
ROOT_DIR = os.getcwd()

# Directory that holds the downloaded data
DATA_PATH = os.path.join(ROOT_DIR, "data")

# Path to the original dataset
original_dataset_dir = os.path.join(DATA_PATH, "train")

# Directory for the small dataset
base_dir = os.path.join(DATA_PATH, "cats_and_dogs_small")
if not os.path.exists(base_dir): 
    os.mkdir(base_dir)

 
# Directory for our training data
train_dir = os.path.join(base_dir, 'train')
if not os.path.exists(train_dir): 
    os.mkdir(train_dir)

# Directory for our validation data
validation_dir = os.path.join(base_dir, 'validation')
if not os.path.exists(validation_dir): 
    os.mkdir(validation_dir)

# Directory for our test data
test_dir = os.path.join(base_dir, 'test')
if not os.path.exists(test_dir):
    os.mkdir(test_dir)    

# Training directory for cat images
train_cats_dir = os.path.join(train_dir, 'cats')
if not os.path.exists(train_cats_dir):
    os.mkdir(train_cats_dir)

# Training directory for dog images
train_dogs_dir = os.path.join(train_dir, 'dogs')
if not os.path.exists(train_dogs_dir):
    os.mkdir(train_dogs_dir)

# Validation directory for cat images
validation_cats_dir = os.path.join(validation_dir, 'cats')
if not os.path.exists(validation_cats_dir):
    os.mkdir(validation_cats_dir)

# Validation directory for dog images
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
if not os.path.exists(validation_dogs_dir):
    os.mkdir(validation_dogs_dir)

# Test directory for cat images
test_cats_dir = os.path.join(test_dir, 'cats')
if not os.path.exists(test_cats_dir):
    os.mkdir(test_cats_dir)

# Test directory for dog images
test_dogs_dir = os.path.join(test_dir, 'dogs')
if not os.path.exists(test_dogs_dir):
    os.mkdir(test_dogs_dir)
    
# Copy the first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)

print('Copy first 1000 cat images to train_cats_dir complete!')

# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)

print('Copy next 500 cat images to validation_cats_dir complete!')

# Copy the next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)

print('Copy next 500 cat images to test_cats_dir complete!')

# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)

print('Copy first 1000 dog images to train_dogs_dir complete!')


# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)

print('Copy next 500 dog images to validation_dogs_dir complete!')

# Copy the next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
    
print('Copy next 500 dog images to test_dogs_dir complete!')
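The script above repeats one make-directory-and-copy pattern for every split and class. As a hedged sketch only, the same layout could be produced by a small helper. `copy_split` and the `SPLITS` table are hypothetical names introduced here; the `cat.N.jpg` / `dog.N.jpg` naming and the index ranges are taken from the script above. Unlike the original, this version silently skips source files that do not exist.

```python
import os
import shutil

# Index ranges per split, matching the script above (hypothetical table name).
SPLITS = {
    "train": range(0, 1000),
    "validation": range(1000, 1500),
    "test": range(1500, 2000),
}


def copy_split(src_dir, base_dir, splits=SPLITS, classes=("cat", "dog")):
    """Copy `<class>.<i>.jpg` from src_dir into base_dir/<split>/<class>s/.

    Returns the number of files copied. Missing source files are skipped
    (an assumption of this sketch; the original script would raise instead).
    """
    copied = 0
    for split, indices in splits.items():
        for cls in classes:
            dst_dir = os.path.join(base_dir, split, cls + "s")
            # makedirs creates parent folders and tolerates existing ones
            os.makedirs(dst_dir, exist_ok=True)
            for i in indices:
                fname = "{}.{}.jpg".format(cls, i)
                src = os.path.join(src_dir, fname)
                dst = os.path.join(dst_dir, fname)
                if os.path.exists(src) and not os.path.exists(dst):
                    shutil.copyfile(src, dst)
                    copied += 1
    return copied
```

With the variables defined earlier, a call would look like `copy_split(original_dataset_dir, base_dir)`.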

#### 2. Import the required packages

import os
import tensorflow  as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from IPython.display import Image
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers

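#### 3. Load the VGG16 model built into Keras (the weights are downloaded on first use, so internet access, possibly through a proxy, is required)

conv_base = keras.applications.VGG16(
    weights='imagenet',
    include_top=False,  # tell Keras we only need the convolutional base
    input_shape=(150,150,3)
)

conv_base.summary()  # summary() prints the table itself and returns None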
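The `(4, 4, 512)` feature shape used in the next step follows from VGG16's five 2×2 max-pooling stages. A small arithmetic sketch, assuming the convolutions keep height and width unchanged ('same' padding) and each stride-2 pooling layer halves them, rounding down; `conv_base_output_shape` is a hypothetical helper, not a Keras function:

```python
def conv_base_output_shape(height, width, pool_stages=5, channels=512):
    """Spatial size after `pool_stages` 2x2, stride-2 max-pooling layers."""
    for _ in range(pool_stages):
        height //= 2  # pooling without padding rounds down
        width //= 2
    return (height, width, channels)


# 150 -> 75 -> 37 -> 18 -> 9 -> 4
print(conv_base_output_shape(150, 150))  # (4, 4, 512)
```

The same arithmetic gives `(7, 7, 512)` for the 224×224 input that VGG16 was originally trained with.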

#### 4. Create the data generator and extract features with the convolutional base

datagen = ImageDataGenerator(rescale=1./255)

batch_size = 20

def extract_features(directory,sample_count):
    features = np.zeros(shape=(sample_count,4,4,512))
    labels = np.zeros(shape=(sample_count))

    generator = datagen.flow_from_directory(
        directory,
        target_size=(150,150),
        batch_size=batch_size,
        class_mode='binary'  # binary labels, since this is a two-class problem
    )

    i = 0
    for inputs_batch,labels_batch in generator:
        # Run the batch through the pre-trained convolutional base and keep its output features
        features_batch = conv_base.predict(inputs_batch)
        features[i*batch_size:(i+1)*batch_size] = features_batch
        labels[i*batch_size:(i+1)*batch_size] = labels_batch
        i += 1
        if i*batch_size >= sample_count:
            break

    print('extract_features complete!')
    return features,labels

# If you already ran the data preparation code above, the path setup below
# is redundant and can be commented out.
base_dir = 'data/cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
train_features,train_labels = extract_features(train_dir,2000)
validation_features,validation_labels = extract_features(validation_dir,1000)
test_features,test_labels = extract_features(test_dir,1000)

train_features = np.reshape(train_features,(2000,4*4*512))
validation_features = np.reshape(validation_features,(1000,4*4*512))
test_features = np.reshape(test_features,(1000,4*4*512))

#### 5. Build the layers that follow the convolutional base (the output head) with a Keras Sequential model

model = models.Sequential([
    layers.Dense(256,activation='relu',input_dim=4*4*512),
    layers.Dropout(rate=0.5),
    layers.Dense(1,activation='sigmoid')
])
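For a sense of scale, the classifier head defined above can be sized by hand. This is a sketch under the assumption that a dense layer holds `inputs * units` weights plus `units` biases and that dropout adds no parameters; `dense_params` is a hypothetical helper:

```python
def dense_params(inputs, units):
    """Parameter count of a fully connected layer: weights plus biases."""
    return inputs * units + units


flattened = 4 * 4 * 512                 # length of each flattened feature vector
hidden = dense_params(flattened, 256)   # Dense(256) on the flattened features
output = dense_params(256, 1)           # Dense(1) with sigmoid
total = hidden + output

print(flattened, hidden, output, total)  # 8192 2097408 257 2097665
```

Roughly two million parameters trained on 2000 images is one reason the dropout layer is there.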

#### 6. Compile & train the model

model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5),
              loss = 'binary_crossentropy',
              metrics=['acc'])

history = model.fit(
    train_features,train_labels,
    epochs = 30,
    batch_size=20,
    validation_data=(validation_features,validation_labels)
)

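#### 7. Visualization (subplots are used here so that all parts of this post, Method 1, Method 2, fine-tuning, and the smoothed curves, end up in one final figure; you can change this to show a figure after each part. Tip: without a GPU, the last two parts run very slowly)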

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epoches = range(len(acc))

fig,ax = plt.subplots(2,4)
plt.subplots_adjust(wspace=1,hspace=1)
ax = ax.flatten()
ax[0].plot(epoches,acc,label='Training acc')
ax[0].plot(epoches,val_acc,label = 'Validation acc')
ax[0].set_title('Training and validation accuracy')
ax[0].legend()


ax[1].plot(epoches,loss,label='Training loss')
ax[1].plot(epoches,val_loss,label='Validation loss')
ax[1].set_title('Training and validation loss')
ax[1].legend()

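### Method 2: stack the networks and train them together, with the base frozen (advantage: image augmentation can be used; drawback: slow)

#### 1. Build the model with a Keras Sequential model: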

model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256,activation='relu'),
    layers.Dense(1,activation='sigmoid')
])
# Number of trainable weight tensors before freezing

print('This is the number of trainable weights before freezing the conv base:',len(model.trainable_weights))
# Number of trainable weight tensors after freezing
conv_base.trainable = False
print('This is the number of trainable weights after freezing the conv base:',len(model.trainable_weights))
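The two numbers printed above can be predicted by counting. A sketch assuming VGG16's convolutional base has 13 convolution layers (2 + 2 + 3 + 3 + 3 across its five blocks) and that every convolution or dense layer contributes two trainable tensors, a kernel and a bias; the constant names are made up for this example:

```python
CONV_LAYERS_PER_BLOCK = [2, 2, 3, 3, 3]  # VGG16 blocks 1 to 5
TENSORS_PER_LAYER = 2                    # kernel + bias
DENSE_LAYERS_IN_HEAD = 2                 # Dense(256) and Dense(1)

conv_base_tensors = sum(CONV_LAYERS_PER_BLOCK) * TENSORS_PER_LAYER
head_tensors = DENSE_LAYERS_IN_HEAD * TENSORS_PER_LAYER

before_freezing = conv_base_tensors + head_tensors
after_freezing = head_tensors  # a frozen base contributes nothing trainable

print(before_freezing, after_freezing)  # 30 4
```

Pooling and flatten layers have no weights, so they do not appear in the count.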

#### 2. Define the training and validation data generators and their data flows

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,  # fraction of the width; an integer such as 40 would be read as pixels
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
# Validation and test data are not augmented
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150,150),
    batch_size=20,
    class_mode='binary'
)

validation_genetator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150,150),
    batch_size=20,
    class_mode='binary'
)
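To make two of the generator options concrete, here is a toy sketch in plain Python of what `rescale=1./255` and `horizontal_flip=True` do to pixel values. It works on a nested list standing in for a tiny one-channel image. `rescale` and `flip_horizontal` are hypothetical helpers written for this illustration and are not how `ImageDataGenerator` is implemented; in the real generator the flip is also applied at random rather than to every image.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 into 0..1."""
    return [[px * factor for px in row] for row in image]


def flip_horizontal(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


image = [[0, 255],
         [51, 102]]

print(flip_horizontal(image))
print(rescale(image))
```

Flipping changes where pixels are but not their values, which is why it is a safe augmentation for cats and dogs.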

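#### 3. Compile, train & save the model

model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5),
              loss='binary_crossentropy',
              metrics=['acc'])

# Keep the returned history so the plots below show this run;
# in TF 2.x, fit() accepts generators directly (fit_generator is deprecated)
history = model.fit(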
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_genetator,
    validation_steps=50,
    verbose=2
)
# The weight directory has already been created in the project directory for saving the model
model.save('./weight/cats_anddogs_small_3.h5')
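The values `steps_per_epoch=100` and `validation_steps=50` follow from the dataset sizes and the batch size of 20: one epoch should pass over every image once. A small sketch of that arithmetic; `steps_for` is a hypothetical helper:

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


print(steps_for(2000, 20))  # training images   -> 100
print(steps_for(1000, 20))  # validation images -> 50
```

If the dataset size or batch size changes, these two arguments should be recomputed rather than copied.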

#### 4. Visualization

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epoches = range(len(acc))  # same name as the x-axis variable used in the plot calls below

ax[2].plot(epoches,acc,label='Training acc')
ax[2].plot(epoches,val_acc,label = 'Validation acc')
ax[2].set_title('Training and validation accuracy')
ax[2].legend()

ax[3].plot(epoches,loss,label='Training loss')
ax[3].plot(epoches,val_loss,label='Validation loss')
ax[3].set_title('Training and validation loss')
ax[3].legend()

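### Fine-tuning

- After the network has been trained once with either of the first two methods (so that the parameters of the custom dense and output layers are no longer far off), unfreeze the top convolutional layers (which extract abstract features) and fine-tune their parameters on the training set, using a small learning rate so that the updates stay small.

#### 1. Select and enable the layers to unfreeze (indexed by layer name)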

conv_base.trainable = True

# Names of the layers to unfreeze; every other layer in the base stays frozen
layers_to_unfreeze = ['block5_conv1','block5_conv2','block5_conv3','block5_pool']
for layer in conv_base.layers:
    layer.trainable = layer.name in layers_to_unfreeze
for layer in conv_base.layers:
    print("{}:{}".format(layer.name,layer.trainable))
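The name-based selection above can also be written as "unfreeze everything from `block5_conv1` onward". The sketch below uses stand-in objects rather than real Keras layers so that it runs without TensorFlow. `FakeLayer` and `unfreeze_from` are hypothetical, and the layer names are assumed to match the ones printed by `conv_base.summary()`.

```python
from dataclasses import dataclass


@dataclass
class FakeLayer:
    """Stand-in for a Keras layer: only a name and a trainable flag."""
    name: str
    trainable: bool = True


def unfreeze_from(layers, first_trainable):
    """Freeze every layer before `first_trainable`; unfreeze it and all later ones."""
    reached = False
    for layer in layers:
        if layer.name == first_trainable:
            reached = True
        layer.trainable = reached
    return layers


names = ["block4_conv3", "block4_pool",
         "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool"]
layers = unfreeze_from([FakeLayer(n) for n in names], "block5_conv1")

for layer in layers:
    print("{}:{}".format(layer.name, layer.trainable))
```

With a real model the same loop would run over `conv_base.layers`; the design choice is only whether to list the layers explicitly or to name the first one.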

#### 2. Compile & train the model (continue training the model from Method 2 on the dataset)

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-5),
              metrics=['acc'])

history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs = 100,
    validation_data = validation_genetator,
    validation_steps = 50
)

model.save('cats_and_dogs_small_4.h5')

#### 3. Visualization

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epoches = range(len(acc))  # must be recomputed: this run has 100 epochs, not 30

ax[4].plot(epoches,acc,label='Training acc')
ax[4].plot(epoches,val_acc,label = 'Validation acc')
ax[4].set_title('Training and validation accuracy')
ax[4].legend()

ax[5].plot(epoches,loss,label='Training loss')
ax[5].plot(epoches,val_loss,label='Validation loss')
ax[5].set_title('Training and validation loss')
ax[5].legend()

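#### 4. The curves from step 3 come out jagged, so a smoothing function is used here to make the plots cleaner (each smoothed value is 0.8 × previous + 0.2 × point, i.e. an exponential moving average; the code comes from a GitHub project)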

def smooth_curve(points,factor = 0.8):
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous*factor + point*(1-factor))
        else:
            smoothed_points.append(point)
    return smoothed_points

ax[6].plot(epoches,smooth_curve(acc),label='Training acc')
ax[6].plot(epoches,smooth_curve(val_acc),label = 'Validation acc')
ax[6].set_title('Training and validation accuracy')
ax[6].legend()

ax[7].plot(epoches,smooth_curve(loss),label='Training loss')
ax[7].plot(epoches,smooth_curve(val_loss),label='Validation loss')
ax[7].set_title('Training and validation loss')
ax[7].legend()

plt.show()
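The smoothing function above is an exponential moving average: each output is 0.8 times the previous smoothed value plus 0.2 times the new point, so a single noisy epoch moves the curve by only a fifth of its deviation. A self-contained worked example follows; the function is repeated so that the snippet runs on its own, and the input values are made up for illustration.

```python
def smooth_curve(points, factor=0.8):
    """Exponential moving average; the first point is kept unchanged."""
    smoothed_points = []
    for point in points:
        if smoothed_points:
            previous = smoothed_points[-1]
            smoothed_points.append(previous * factor + point * (1 - factor))
        else:
            smoothed_points.append(point)
    return smoothed_points


raw = [1.0, 0.0, 1.0, 0.0]  # a deliberately jagged sequence
smoothed = smooth_curve(raw)

# Step by step:
#   1.0
#   0.8 * 1.0  + 0.2 * 0.0
#   0.8 * 0.8  + 0.2 * 1.0
#   0.8 * 0.84 + 0.2 * 0.0
print([round(v, 3) for v in smoothed])  # [1.0, 0.8, 0.84, 0.672]
```

A larger `factor` gives a smoother but slower-reacting curve; `factor=0` returns the raw points.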

#### 5. Report accuracy on the test set

test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

test_loss, test_acc = model.evaluate(test_generator, steps=50)  # evaluate() accepts generators in TF 2.x
print('test acc:', test_acc)

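### Summary figure for the whole project:

Note: the figure has not been saved yet; it will be added in tomorrow's update.

That's all.

I hope this helps.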
