ML.NET Machine Learning Framework Study Notes [8]: Object Detection (Using the YOLO2 Model)

阿新 • Published: 2019-06-04

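I. Overview

This article walks through an object detection application built on a YOLO model. The original code comes from: https://github.com/dotnet/machinelearning-samples

The program takes an image as input, detects the objects in it, and marks each detection on the image with a red bounding box.

About YOLO

YOLO (You Only Look Once) is a real-time object detection system that its authors describe as state of the art. Official site: https://pjreddie.com/darknet/yolo/

This article uses the TinyYolo2 model, which can recognize the following object classes: "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor".

About ONNX

ONNX (Open Neural Network Exchange) is a common standard for representing deep learning models, so that a model can be exchanged between different frameworks. Its specification and code are developed jointly, mainly by companies such as Microsoft, Amazon, Facebook, and IBM. With the ONNX standard, ML.NET code can use models that were trained and saved with other machine learning frameworks.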

II. Code Analysis

1. The Main method

        static void Main(string[] args)
        {
            TrainAndSave();
            LoadAndPredict();

            Console.WriteLine("Press any key to exit!");
            Console.ReadKey();
        }

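On the first run, TrainAndSave must be called to generate the local model file. Once that file exists, the call can be skipped and the prediction code (LoadAndPredict) can be run directly.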
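One way to avoid rebuilding the model on every run is to check whether the saved file already exists. The sketch below is only an assumption about how Main could be arranged, not code from the sample: `ModelBootstrap` and `EnsureModel` are hypothetical names, and the `build` delegate stands in for TrainAndSave.

```csharp
using System;
using System.IO;

public static class ModelBootstrap
{
    // Runs `build` only when no file exists at modelPath.
    // Returns true if the build step was executed.
    public static bool EnsureModel(string modelPath, Action build)
    {
        if (File.Exists(modelPath))
            return false;   // model already saved; skip the build step

        build();            // e.g. TrainAndSave in the article's code
        return true;
    }
}
```

With this helper, Main could call `ModelBootstrap.EnsureModel(ObjectDetectionModelFilePath, TrainAndSave);` before `LoadAndPredict();`.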

2. Training and saving the model

        static readonly string tagsTsv = Path.Combine(trainImagesFolder, "tags.tsv");

        private static void TrainAndSave()
        {
            var mlContext = new MLContext();
            var trainData = mlContext.Data.LoadFromTextFile<ImageNetData>(tagsTsv);

            // Load each image from disk, using the path stored in the ImagePath column.
            var pipeline = mlContext.Transforms.LoadImages(outputColumnName: "image", imageFolder: trainImagesFolder, inputColumnName: nameof(ImageNetData.ImagePath))
                    // Resize to the input size the model expects.
                    .Append(mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "image"))
                    // Convert the bitmap into a float pixel vector.
                    .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
                    // Score the pixels with the pre-trained ONNX model.
                    .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: YOLO_ModelFilePath, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));

            var model = pipeline.Fit(trainData);

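            // File.Create truncates an existing file; File.OpenWrite would leave stale
            // trailing bytes behind if a larger model file was already there.
            using (var file = File.Create(ObjectDetectionModelFilePath))
                mlContext.Model.Save(model, trainData.Schema, file);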

            Console.WriteLine("Save Model success!");
        }

The ImageNetData class is defined as follows:

    public class ImageNetData
    {
        [LoadColumn(0)]
        public string ImagePath;

        [LoadColumn(1)]
        public string Label;
    }

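The tags.tsv file contains only a single sample row. The model is already trained, so there is nothing to gain from training it again; one sample image is enough. Calling Fit here does not train anything, it only builds the data processing pipeline as a model (an ITransformer) that can be saved.

The ApplyOnnxModel method loads the third-party ONNX model. Its input and output column names are taken from the following settings: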

    public struct TinyYoloModelSettings
    {
        // input tensor name
        public const string ModelInput = "image";

        // output tensor name
        public const string ModelOutput = "grid";
    }

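The input and output column names are fixed by the model and must match the tensor names inside the ONNX file. A tool such as Netron can be used to inspect the ONNX file and see the names of its inputs and outputs.

3. Using the model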

        private static void LoadAndPredict()
        {
            var mlContext = new MLContext();

            ITransformer trainedModel;
            using (var stream = File.OpenRead(ObjectDetectionModelFilePath))
            {
                trainedModel = mlContext.Model.Load(stream, out var modelInputSchema);               
            }
            var predictionEngine = mlContext.Model.CreatePredictionEngine<ImageNetData, ImageNetPrediction>(trainedModel);

            DirectoryInfo testdir = new DirectoryInfo(testimagesFolder);
            foreach (var jpgfile in testdir.GetFiles("*.jpg"))
            {  
                ImageNetData image = new ImageNetData
                {
                    ImagePath = jpgfile.FullName
                };               
                var Predicted = predictionEngine.Predict(image);
                PredictImage(image.ImagePath, Predicted);                 
            }
        }

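The code iterates over the JPG files in a folder, runs each file through the model, and obtains the prediction result.

The ImageNetPrediction class is defined as follows: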

    public class ImageNetPrediction
    {
        [ColumnName(TinyYoloModelSettings.ModelOutput)]
        public float[] PredictedLabels;       
    }

The "grid" output column is a float array whose meaning cannot be read directly, so code is needed to convert it into an easier-to-understand format.
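To give a sense of what such a parser has to do, here is a minimal sketch of reading one candidate box out of the flat array. It assumes the usual Tiny YOLOv2 (Pascal VOC) output layout of 125 channels over a 13 × 13 grid, that is, 5 boxes per cell with 25 values each (x, y, width, height, objectness, and 20 class scores), stored channel-first. `GridDecoder` and its members are hypothetical names used for illustration, not the YoloWinMlParser implementation, and the anchor-based conversion to pixel coordinates is left out.

```csharp
using System;
using System.Linq;

public static class GridDecoder
{
    public const int GridSize = 13;                 // assumed 13 x 13 cells
    public const int BoxesPerCell = 5;              // assumed 5 boxes per cell
    public const int ClassCount = 20;               // Pascal VOC classes
    public const int ValuesPerBox = 5 + ClassCount; // x, y, w, h, objectness + classes

    // Index of one value in the flat, channel-first output array.
    public static int Offset(int row, int col, int channel)
        => channel * GridSize * GridSize + row * GridSize + col;

    public static float Sigmoid(float v) => 1f / (1f + (float)Math.Exp(-v));

    // Numerically stable softmax: subtract the maximum before exponentiating.
    public static float[] Softmax(float[] values)
    {
        float max = values.Max();
        double[] exps = values.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }

    // Best class for one box and its score (objectness * class probability).
    public static (int ClassIndex, float Score) BestClass(float[] output, int row, int col, int box)
    {
        int firstChannel = box * ValuesPerBox;
        float objectness = Sigmoid(output[Offset(row, col, firstChannel + 4)]);

        var classScores = new float[ClassCount];
        for (int c = 0; c < ClassCount; c++)
            classScores[c] = output[Offset(row, col, firstChannel + 5 + c)];

        float[] probs = Softmax(classScores);
        int best = Array.IndexOf(probs, probs.Max());
        return (best, objectness * probs[best]);
    }
}
```

A parser built along these lines would loop over every row, column, and box, and keep only the boxes whose score exceeds a chosen threshold.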

     YoloWinMlParser _parser = new YoloWinMlParser();
     IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(Predicted.PredictedLabels, 0.4f);

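YoloWinMlParser.ParseOutputs converts the float array into a list of YoloBoundingBox objects. The second argument is the confidence threshold: only detections whose confidence is above it are returned.

The YoloBoundingBox class is defined as follows: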

    class YoloBoundingBox
    {    
        public string Label { get; set; }
        public float Confidence { get; set; }

        public float X { get; set; }
        public float Y { get; set; }
        public float Height { get; set; }
        public float Width { get; set; }
        public RectangleF Rect
        {
            get { return new RectangleF(X, Y, Width, Height); }
        }
    }

Here, Label is the object class and Confidence is the confidence score of the detection.
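If the box coordinates are expressed in the model's input space (an assumption here; Tiny YOLOv2 is commonly fed 416 × 416 images), they have to be rescaled before a rectangle is drawn on the original picture. The sketch below shows that arithmetic only. `BoxScaler.ScaleToImage` is a hypothetical helper, not part of the sample.

```csharp
using System;

public static class BoxScaler
{
    // Maps a box from model-input space to the original image.
    // The box is first clipped to the model-input area, then scaled per axis.
    public static (float X, float Y, float Width, float Height) ScaleToImage(
        float x, float y, float width, float height,
        int modelWidth, int modelHeight, int imageWidth, int imageHeight)
    {
        float scaleX = (float)imageWidth / modelWidth;
        float scaleY = (float)imageHeight / modelHeight;

        float left = Math.Max(x, 0f) * scaleX;
        float top = Math.Max(y, 0f) * scaleY;
        float right = Math.Min(x + width, modelWidth) * scaleX;
        float bottom = Math.Min(y + height, modelHeight) * scaleY;

        return (left, top, right - left, bottom - top);
    }
}
```

For example, a box at (104, 208) with size 208 × 104 in a 416 × 416 input maps to (208, 208) with size 416 × 104 on an 832 × 416 image.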

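Because of the way YOLO works, the same object is often reported several times as overlapping detections, so the results need a further filtering step that removes boxes that overlap heavily.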

     YoloWinMlParser _parser = new YoloWinMlParser();
     IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(Predicted.PredictedLabels, 0.4f); 
     var filteredBoxes = _parser.NonMaxSuppress(boundingBoxes, 5, 0.6F);

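The second argument of YoloWinMlParser.NonMaxSuppress is the maximum number of results to keep, and the third is the overlap threshold: boxes whose overlap ratio exceeds it are removed.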
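The idea behind this filtering can be sketched as greedy non-maximum suppression, with intersection over union (IoU) as the overlap ratio. This is an illustrative version under those assumptions, and the sample's own NonMaxSuppress may differ in detail. `Box` and `NmsSketch` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Box
{
    public float X, Y, Width, Height, Confidence;
}

public static class NmsSketch
{
    // Intersection over union of two axis-aligned boxes, in [0, 1].
    public static float IoU(Box a, Box b)
    {
        float interW = Math.Min(a.X + a.Width, b.X + b.Width) - Math.Max(a.X, b.X);
        float interH = Math.Min(a.Y + a.Height, b.Y + b.Height) - Math.Max(a.Y, b.Y);
        if (interW <= 0 || interH <= 0) return 0f;

        float inter = interW * interH;
        float union = a.Width * a.Height + b.Width * b.Height - inter;
        return union <= 0 ? 0f : inter / union;
    }

    // Greedy NMS: visit boxes from most to least confident and keep a box only
    // if it does not overlap an already kept box by more than the threshold.
    public static List<Box> Suppress(IEnumerable<Box> boxes, int limit, float threshold)
    {
        var kept = new List<Box>();
        foreach (var candidate in boxes.OrderByDescending(b => b.Confidence))
        {
            if (kept.Count >= limit) break;
            if (kept.All(k => IoU(k, candidate) <= threshold))
                kept.Add(candidate);
        }
        return kept;
    }
}
```

Calling `NmsSketch.Suppress(boxes, 5, 0.6f)` mirrors the arguments used in the article: at most five boxes are kept, and a box is dropped when its IoU with a more confident kept box is above 0.6.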

III. Resources

Source code download: https://github.com/seabluescn/Study_ML.NET

Project name: YOLO_ObjectDetection

Assets: https://gitee.com/seabluescn/ML_Assets (ObjectDetection)
