Machine Learning Notes (22): TensorFlow 2 (ImageDataGenerator)

阿新 • Published: 2021-08-10

This blog is only for my own study, not for teaching others. It mainly records notes that I myself can understand.

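Source of the material: [Andrew Ng's team, TensorFlow 2.0 in Practice series, Course 1] Introduction to artificial intelligence, machine learning, and deep learning with TensorFlow 2.0, plus basic programming (bilibili)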

I have spent a whole day fixing bugs in the ImageDataGenerator lesson, and my brain is fried.

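ImageDataGenerator, which I will call IDG for short, is a very convenient (doge) API that labels large numbers of images, using the name of the folder each image sits in, and resizes them at the same time. The example in Andrew Ng's video is telling humans from horses, but for a long time I could not find the data. In the end I found a blog post that saved me: "A horse-vs-human image classifier in TensorFlow [using ImageDataGenerator, no manual labelling needed]" (STILLxjy, CSDN blog)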

Hallelujah!

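That post gives the download address for the dataset as well as working code. Its author has presumably also taken Andrew Ng's machine learning courses.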

Note that you have to put together the test data yourself, or find some online.

Here is my file layout, as a reminder to myself of how the files are stored:
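The directory listing itself is not reproduced in this copy of the post. Judging only from the paths used in the code, the layout is probably close to the one sketched here. The class subfolder names `horses` and `humans` are an assumption, and `infer_class_indices` is a hypothetical helper that imitates (it does not call) the way `flow_from_directory` turns subfolder names into labels in alphanumeric order.

```python
import tempfile
from pathlib import Path

# Assumed layout: the top-level paths come from the code in this post,
# the class subfolder names "horses" and "humans" are a guess.
ASSUMED_LAYOUT = {
    "tmp/horse-or-human": ["horses", "humans"],
    "tmp/validation-horse-or-human": ["horses", "humans"],
    "tmp/test-horse-or-human": ["testdata"],
}


def infer_class_indices(directory):
    """Map each subfolder name to an integer label, in alphanumeric order."""
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


# Build the assumed layout in a throwaway directory and inspect the labels.
with tempfile.TemporaryDirectory() as root:
    for folder, subfolders in ASSUMED_LAYOUT.items():
        for sub in subfolders:
            (Path(root) / folder / sub).mkdir(parents=True)
    class_indices = infer_class_indices(Path(root) / "tmp/horse-or-human")

print(class_indices)  # {'horses': 0, 'humans': 1}
```

With this ordering a sigmoid output close to 1 means "human" and close to 0 means "horse", which matches how the prediction loop in this post reads the result.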

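One more thing: the instructor in Andrew Ng's course used google.colab to get the file names, and so did the author of that blog post. However, according to the bullet comments on Bilibili, installing google.colab breaks a conda virtual environment that already has tensorflow installed, so it is not recommended. I therefore switched to a different way of getting the files.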
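As an illustration of collecting file names with the standard library only, here is a minimal sketch. The helper `list_image_files` and the suffix set are hypothetical, not part of the course material or of google.colab, and the demo files are empty placeholders created just for the example.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the test folder holds.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_image_files(directory):
    """Return the sorted names of image files directly inside `directory`."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo on a throwaway folder with two fake images and one unrelated file.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["horse.png", "human1.jpeg", "notes.txt"]:
        (Path(demo_dir) / name).touch()
    found = list_image_files(demo_dir)

print(found)  # ['horse.png', 'human1.jpeg']
```

Filtering by extension keeps stray files (notes, hidden system files) from being passed to the image loader.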

Now straight to the code:

from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator as idg
import tensorflow as tf
import numpy as np
import os
from tensorflow.keras.optimizers import RMSprop

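class myCallback(tf.keras.callbacks.Callback): # subclass of Callback
    def on_epoch_end(self,epoch,logs=None): # override on_epoch_end
        logs=logs or {}
        val_loss=logs.get('val_loss')
        val_accuracy=logs.get('val_accuracy')
        # stop only when both validation metrics are available and good enough
        if val_loss is not None and val_accuracy is not None and val_loss<10 and val_accuracy>0.80:
            print('\nReached 80% validation accuracy so canceling training.')
            self.model.stop_training=True # condition met, stop training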

callback=myCallback()
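filepath=os.path.abspath(__file__) # absolute path of this file
filepath=os.path.dirname(filepath) # directory that contains this file
testdir=filepath+'/tmp/test-horse-or-human/testdata' # folder that holds the test images
files=sorted(f for f in os.listdir(testdir) if os.path.isfile(os.path.join(testdir,f))) # names of all test images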

datagen=idg(rescale=1./255) # generator that rescales pixel values to [0,1]
traingen=datagen.flow_from_directory( # training set, labelled by subfolder name
    filepath+'/tmp/horse-or-human', # location of the dataset
    target_size=(300,300), # resize every image to 300x300
    batch_size=2, # keep this small or memory runs out, which is why my program runs painfully slowly
    class_mode='binary' # binary classification
)
valigen=datagen.flow_from_directory( # validation set
    filepath+'/tmp/validation-horse-or-human',
    target_size=(300,300),
    batch_size=2,
    class_mode='binary'
)

model=tf.keras.Sequential([
    tf.keras.layers.Conv2D(16,(3,3),activation='relu',input_shape=(300,300,3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32,(3,3),activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64,(3,3),activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512,activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1,activation='sigmoid')
])

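model.compile(
    optimizer=RMSprop(learning_rate=0.0005), # RMSprop optimizer (`lr` is a deprecated name for learning_rate)
    loss='binary_crossentropy', # binary cross-entropy
    metrics=['accuracy']
)
model.summary() # summary() prints the table itself and returns None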

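model.fit( # train the model
    traingen, # training data generator
    steps_per_epoch=514, # mind the batch_size above: steps_per_epoch*batch_size should cover the whole training set
    epochs=15,
    validation_data=valigen, # validation data generator
    validation_steps=128, # same idea: validation_steps*batch_size should cover the whole validation set
    callbacks=[callback]
)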

for file in files:
    pat=filepath+'/tmp/test-horse-or-human/testdata/'+file
#    img=cv2.imdecode(np.fromfile(pat,dtype=np.uint8),-1)
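    img=image.load_img(pat,target_size=(300,300)) # load the image, resized to 300x300
    x=image.img_to_array(img) # convert to a NumPy array
    imgs=np.expand_dims(x,axis=0) # add a batch dimension: (300,300,3) -> (1,300,300,3)

#    imgs=np.vstack([imgs])
    # rescale the same way as the training generator (rescale=1./255);
    # results looked slightly better to me without it, but the input should match what the model was trained on
    imgs=imgs/255.0
    classes=model.predict(imgs,batch_size=10)
    print(classes[0]) # print the predicted value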
    if classes[0]>0.5:
        print(file+' is a human.')
    else:
        print(file+' is a horse.')
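The `steps_per_epoch=514` and `validation_steps=128` values passed to `model.fit` above follow from simple arithmetic on the dataset size and `batch_size`. A minimal sketch, assuming the usual sizes of the horse-or-human dataset (1027 training images and 256 validation images; the post itself does not state these numbers) and a hypothetical helper `steps_for`:

```python
import math


def steps_for(num_images, batch_size):
    """Smallest number of batches that covers every image once."""
    return math.ceil(num_images / batch_size)


# Assumed dataset sizes (not stated in the post); batch_size is from the code.
TRAIN_IMAGES = 1027
VAL_IMAGES = 256
BATCH_SIZE = 2

train_steps = steps_for(TRAIN_IMAGES, BATCH_SIZE)
val_steps = steps_for(VAL_IMAGES, BATCH_SIZE)

print(train_steps, val_steps)  # 514 128
```

If the batch size changes, the step counts should be recomputed the same way rather than left hard-coded.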

Result:

Epoch 15/15
514/514 [==============================] - 59s 115ms/step - loss: 0.0021 - accuracy: 0.9990 - val_loss: 12.9590 - val_accuracy: 0.7969
[0.]
horse.png is a horse.
[1.151766e-30]
horse1.jpeg is a horse.
[0.99999046]
horse2.jpg is a human.
[1.]
human.png is a human.
[1.]
human1.jpeg is a human.
[0.]
human2.jpeg is a horse.
[0.]
human3.png is a horse.
[8.684954e-06]
human4.jpg is a horse.
[2.733185e-09]
test.png is a horse.
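To read this output a little more carefully, here is a small sketch that tallies the predictions and re-checks the early-stopping condition on the last epoch. The scores are copied from the output above. Guessing the true label from the file name prefix is my assumption, so `test.png` is left out of the tally, and the helper names are hypothetical.

```python
# (file name, sigmoid output) pairs copied from the printed results.
results = [
    ("horse.png", 0.0),
    ("horse1.jpeg", 1.151766e-30),
    ("horse2.jpg", 0.99999046),
    ("human.png", 1.0),
    ("human1.jpeg", 1.0),
    ("human2.jpeg", 0.0),
    ("human3.png", 0.0),
    ("human4.jpg", 8.684954e-06),
    ("test.png", 2.733185e-09),
]


def predicted_label(score, threshold=0.5):
    """Same rule as the prediction loop: above 0.5 means human."""
    return "human" if score > threshold else "horse"


def expected_label(file_name):
    """Assumption: the file name prefix states the true class, else unknown."""
    for label in ("horse", "human"):
        if file_name.startswith(label):
            return label
    return None


labelled = [(name, score) for name, score in results if expected_label(name)]
correct = sum(predicted_label(score) == expected_label(name)
              for name, score in labelled)
print(f"{correct}/{len(labelled)} correct")  # 4/8 correct

# Early-stopping condition from the callback, applied to the epoch 15 metrics.
final_logs = {"val_loss": 12.9590, "val_accuracy": 0.7969}
would_stop = final_logs["val_loss"] < 10 and final_logs["val_accuracy"] > 0.80
print(would_stop)  # False
```

Under that naming assumption only half of the labelled test images are classified correctly. Together with a training accuracy of 0.9990 against a validation accuracy of 0.7969 and a validation loss near 13, this suggests the model has overfitted the training set, and it also explains why the callback never stopped training early.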