[Paper Notes] One Millisecond Face Alignment with an Ensemble of Regression Trees

阿新 • Published: 2019-02-05

Reference:

Kazemi V, Sullivan J. One millisecond face alignment with an ensemble of regression trees[C]//Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014: 1867-1874

Introduction

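This CVPR 2014 paper on facial landmark detection is based on the Ensemble of Regression Trees algorithm (ERT below). It is extremely fast (detecting the landmarks of a single face takes about 1 ms) and the accuracy is good as well. It can also handle training sets in which some landmark annotations are missing.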

The dlib library (http://dlib.net/) contains a complete implementation of this algorithm, covering both training and testing.

With both a paper and code available, those of us doing hands-on algorithm work could not be happier.
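As a rough illustration of the dlib implementation mentioned above, the sketch below calls it from Python. It assumes the `dlib` Python package is installed and that `model_path` points to a trained shape predictor model file supplied by the reader. The function names `shape_to_points` and `detect_landmarks` are hypothetical helpers written for these notes, not part of dlib.

```python
from typing import List, Tuple


def shape_to_points(shape) -> List[Tuple[int, int]]:
    """Convert a dlib full_object_detection (or any object exposing
    num_parts and part(i)) into a plain list of (x, y) tuples."""
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]


def detect_landmarks(image_path: str, model_path: str) -> List[List[Tuple[int, int]]]:
    """Return one list of landmark points per detected face.

    image_path and model_path are assumptions: any readable image and
    any trained shape predictor model will do.
    """
    import dlib  # imported lazily so shape_to_points works without dlib installed

    detector = dlib.get_frontal_face_detector()   # face detector bundled with dlib
    predictor = dlib.shape_predictor(model_path)  # the ERT landmark model
    image = dlib.load_rgb_image(image_path)
    faces = detector(image, 1)  # 1 = upsample the image once before detecting
    return [shape_to_points(predictor(image, face)) for face in faces]
```

Calling `detect_landmarks("face.jpg", "model.dat")` with real files (both file names here are placeholders) would return the landmark coordinates for each face found in the image.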

Algorithm

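LBF (Face Alignment at 3000 FPS via Regressing Local Binary Features) is another tree-based facial landmark detection algorithm. It learns local binary features for each landmark, combines those features, and then uses linear regression to locate the landmarks. ERT differs in that, while the trees are being learned, the shape update ΔS is stored directly in each leaf node. The initial shape S passes through all the learned trees; adding the ΔS from every leaf node reached to the mean shape gives the final facial landmark positions. The overall pipeline is shown in the figure below: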

[Figure: overall pipeline]

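In formula form:
$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t(I, \hat{S}^{(t)})$$

Here t is the cascade index and r_t(·,·) is the regressor at the current level. The regressor takes as input the image I and the shape \hat{S}^{(t)} produced by the previous level; the features it uses can be gray-level intensities or something else.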
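The cascade update above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's or dlib's implementation: every tree is a depth-1 stump, each split compares the intensities of two pixels located relative to a landmark, and all names (`Stump`, `Probe`, `sample`, `run_cascade`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = List[Tuple[float, float]]  # landmark (x, y) positions
Image = List[List[float]]          # grayscale image, indexed image[row][col]
Probe = Tuple[int, float, float]   # (landmark index, dx, dy): pixel relative to a landmark


def sample(image: Image, shape: Shape, probe: Probe) -> float:
    """Read the intensity at a probe position, clamped to the image bounds."""
    k, dx, dy = probe
    x, y = shape[k]
    row = min(max(int(round(y + dy)), 0), len(image) - 1)
    col = min(max(int(round(x + dx)), 0), len(image[0]) - 1)
    return image[row][col]


@dataclass
class Stump:
    """Depth-1 regression tree whose two leaves each store a shape update."""
    probe_a: Probe
    probe_b: Probe
    threshold: float
    delta_if_greater: Shape  # leaf reached when the pixel difference exceeds the threshold
    delta_otherwise: Shape   # leaf reached otherwise

    def predict(self, image: Image, shape: Shape) -> Shape:
        diff = sample(image, shape, self.probe_a) - sample(image, shape, self.probe_b)
        return self.delta_if_greater if diff > self.threshold else self.delta_otherwise


def add_shapes(a: Shape, b: Shape) -> Shape:
    """Add two shapes landmark by landmark."""
    return [(ax + bx, ay + by) for (ax, ay), (bx, by) in zip(a, b)]


def run_cascade(image: Image, mean_shape: Shape, cascade: List[List[Stump]]) -> Shape:
    """Apply S <- S + r_t(I, S) for each cascade level t.

    Each level r_t is modeled as the sum of the leaf updates of its trees,
    all evaluated on the shape that entered the level.
    """
    shape = [(float(x), float(y)) for x, y in mean_shape]
    for level in cascade:
        update = [(0.0, 0.0)] * len(shape)
        for tree in level:
            update = add_shapes(update, tree.predict(image, shape))
        shape = add_shapes(shape, update)
    return shape
```

Because the probes are defined relative to the current landmarks, the same tree can read different pixels at different cascade levels, which is why r_t depends on both the image and the current shape.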

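To train the regressor r_t at each level, the paper uses the gradient tree boosting algorithm to reduce the sum of squared errors between the shape estimate, which starts from the initial shape, and the ground truth.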

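Each regressor consists of many trees. The parameters of each tree are learned from the coordinate differences between the current shape and the ground truth, together with randomly selected pixel pairs. See the paper for the full algorithm (I have not completely understood it yet).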
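To make the training idea more concrete, here is a minimal sketch of gradient tree boosting with a squared-error loss. It is a simplification and not the paper's exact procedure: trees are depth-1 stumps, the pixel-pair differences are assumed to be precomputed as the `features` vectors, candidate splits are drawn at random, and the names and parameters (`fit_stump`, `boost`, `shrinkage`, `n_candidates`) are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector], dim: int) -> Vector:
    """Component-wise mean of the rows (zero vector if there are none)."""
    if not rows:
        return [0.0] * dim
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]


def sq_error(rows: List[Vector], center: Vector) -> float:
    """Sum of squared distances from each row to the center."""
    return sum((r[j] - center[j]) ** 2 for r in rows for j in range(len(center)))


def total_sq_error(targets: List[Vector], preds: List[Vector]) -> float:
    """Sum of squared errors between targets and predictions."""
    return sum((t - p) ** 2 for tv, pv in zip(targets, preds) for t, p in zip(tv, pv))


def fit_stump(features, residuals, n_candidates, rng):
    """Try random (feature, threshold) splits and keep the one with the lowest squared error."""
    dim = len(residuals[0])
    best = None
    for _ in range(n_candidates):
        f = rng.randrange(len(features[0]))
        thr = rng.choice(features)[f]  # threshold taken from an observed feature value
        left = [r for x, r in zip(features, residuals) if x[f] > thr]
        right = [r for x, r in zip(features, residuals) if x[f] <= thr]
        lm, rm = mean_vector(left, dim), mean_vector(right, dim)
        err = sq_error(left, lm) + sq_error(right, rm)
        if best is None or err < best[0]:
            best = (err, f, thr, lm, rm)
    return best[1:]


def boost(features, targets, init, n_trees=20, shrinkage=0.5, n_candidates=10, seed=0):
    """Fit stumps one after another to the residuals (targets minus current predictions)."""
    rng = random.Random(seed)
    preds = [list(init) for _ in targets]
    trees = []
    for _ in range(n_trees):
        residuals = [[t - p for t, p in zip(tv, pv)] for tv, pv in zip(targets, preds)]
        f, thr, lm, rm = fit_stump(features, residuals, n_candidates, rng)
        lm = [shrinkage * v for v in lm]  # leaves store the shrunken mean residual
        rm = [shrinkage * v for v in rm]
        trees.append((f, thr, lm, rm))
        for x, p in zip(features, preds):
            delta = lm if x[f] > thr else rm
            for j in range(len(p)):
                p[j] += delta[j]
    return trees, preds
```

In this sketch each target vector plays the role of a flattened ground-truth shape and `init` plays the role of the initial shape. Every leaf stores the shrunken mean residual of the samples that reach it, which mirrors the ΔS stored in ERT's leaf nodes.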

Summary

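After running the dlib code with its default parameters, I could largely reproduce the results reported in the paper. This is unlike LBF, where no amount of parameter tuning got close to the results in its paper.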

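The speed is of the same order of magnitude as LBF, and the accuracy is slightly better than SDM. The drawback is that the model is somewhat large (this seems to be true of all random-tree-based algorithms).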

I still need to study the algorithm and the paper further.
