論文筆記：Feature Pyramid Networks for Object Detection

阿新 • • 發佈：2018-12-30

初衷

Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning object detectors have avoided pyramid representations, in part because they are compute and memory intensive. In this paper, we exploit the inherent multi-scale,pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost.

簡而言之就是特徵金字塔一直很好用，只是費時費力，我們提出了一直很好的方法可以高效地使用特徵金字塔。

介紹

一些用feature pyramid的方法：

用圖片金字塔生成特徵金字塔
只在特徵最上層預測
特徵層分層預測
FPN從高層攜帶資訊傳給底層，再分層預測

We rely on an architecture that combines low-resolution, semantically strong features with high-resolution, semantically weak features via a top-down pathway and lateral connections .

作者闡釋他這樣做的好處，就是融合底層細節和高層概括。

Similar architectures adopting top-down and skip connections are popular in recent research. Their goals are to produce a single high-level feature map of a fine resolution, on which the predictions are to be made.

很多人多想到過融合，但是他們沒有分層預測。。。。

Feature Pyramid Networks

目標

Our method takes a single-scale image of an arbitrary size as input, and outputs proportionally sized feature maps at multiple levels, in a fully convolutional fashion.

實現

Bottom-up pathway

Specifically, for ResNets we use the feature activations output by each stage’s last residual block. We denote the output of these last residual blocks as {C 2 , C 3 , C 4 , C 5 } for conv2, conv3, conv4, and conv5 outputs, and note that they have strides of {4, 8, 16, 32} pixels with respect to the input image. We do not include conv1 into the pyramid due to its large memory footprint.

自底向上的過程就是神經網路普通的正向傳播。考慮到後的步驟，要逐層抽取特徵。那些feature map大小相同的地方，只抽取最上層。在試驗中，用的ResNets的卷積層，分為了{C1，C2，C3，C4，C5}用了C2，C3，C4，C5的feature map。不用C1，因為太大了。

Top-down pathway and lateral connections

這裡寫圖片描述

With a coarser-resolution feature map, we upsample the spatial resolution by a factor of 2 (using nearest neighbor upsampling for simplicity). The upsampled map is then merged with the corresponding bottom-up map (which undergoes a 1×1 convolutional layer to reduce channel dimensions) by element-wise addition.

上層feature map上取樣和下層一樣大，下層的經過1×1卷積核使得維度和上層一樣，之後按元素相加。

To start the iteration, we simply attach a 1×1 convolutional layer on C 5 to produce the coarsest resolution map.

為了開始這個操作，我們的最上面的feature map是通過1×1的卷積核生成的。

Finally, we append a 3×3 convolution on each merged map to generate the final feature map, which is to reduce the aliasing effect of upsampling. This final set of feature maps is called {P 2 , P 3 , P 4 , P 5 }, corresponding to {C 2 , C 3 , C 4 , C 5 } that are respectively of the same spatial sizes.

為了減少混疊效應帶來的影響，每層還經過一個3×3的卷積核生成最後的feature map，記作{P 2 , P 3 , P 4 , P 5 }。

應用

RPN

We adapt RPN by replacing the single-scale feature map with our FPN. We attach a head of the same design (3×3 conv and two sibling 1×1 convs) to each level on our feature pyramid.

將原來的小網路接在每一層上。引數共享

Because the head slides densely over all locations in all pyramid levels, it is not necessary to have multi-scale anchors on a specific level. Instead, we assign anchors of a single scale to each level. Formally, we define the anchors to have areas of {322,642,1282,2562,5122 } pixels on {P 2 , P 3 , P 4 , P 5 , P 6 } respectively. 1 As in [28] we also use anchors of multiple aspect ratios {1:2, 1:1, 2:1} at each level. So in total there are 15 anchors over the pyramid.

anchor 的長寬比就不用改了，因為每層都不一樣大。其實這樣有15anchors，比之前的9個還要多呢。

Fast R-CNN

選取合適大小的feature map。通過下面的式子計算用哪一層的特徵：

k=⌊k0+log2(wh‾‾‾√/224)⌋
k0是開始設定的，就是沒用這種方法的時候該用哪層。

論文筆記：Feature Pyramid Networks for Object Detection

初衷 Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep

論文閱讀筆記（二十二）：Feature Pyramid Networks for Object Detection（FPN）

Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning o

論文閱讀 | FPN：Feature Pyramid Networks for Object Detection

語義 alt bubuko 獨立 margin dual eat 方法神經網絡論文地址：https://arxiv.org/pdf/1612.03144v2.pdf 代碼地址：https://github.com/unsky/FPN 概述 FPN是FAIR發表在CV

特徵金字塔特徵用於目標檢測：Feature Pyramid Networks for Object Detection

前言：這篇論文主要使用特徵金字塔網路來融合多層特徵，改進了CNN特徵提取。作者也在流行的Fast&Faster R-CNN上進行了實驗，在COCO資料集上測試的結果現在排名第一，其中隱含的說明了其在小目標檢測上取得了很大的進步。其實整體思想比較簡單，但是實驗部分

論文解讀之Feature Pyramid Networks for Object Detection

論文名稱：Feature Pyramid Networks for Object Detection 這是一篇CVPR2017的文章，提出一種新型的特徵金字塔網路，作者是何開明等人首先，文章介

Feature Pyramid Networks for Object Detection 論文筆記

版權宣告：本文為博主原創文章，未經博主允許不得轉載。 https://blog.csdn.net/Jesse_Mx/article/details/54588085 論文地址：Feature Pyramid Networks for Object Detection 前言這篇論文主要使

Feature Pyramid Networks for Object Detection論文筆記

1、摘要 Feature pyramids are a basic component in recognition systems for detecting objects at diferent scales.But recent deep learning object detector

【Network Architecture】Feature Pyramid Networks for Object Detection(FPN)論文解析（轉）

目錄 0. 前言 1. 部落格一 2.。部落格二 0. 前言這篇論文提出了一種新的特徵融合方式來解決多尺度問題，感覺挺有創新性的，如果需要與其他網路進行拼接，還是需要再回到原文看一下細節。這裡轉了兩篇比較好的部落格作為備忘。 1. 部落格一這篇論文是CVPR20

Feature Pyramid Networks for Object Detection論文研讀與問題討論

Feature Pyramid Networks for Object Detection論文研讀與問題討論前言（1）背景介紹 Featurized Image Pyramids Fast & Faster R-CNN

Feature Pyramid Networks for Object Detection論文翻譯——中文版

文章作者：Tyan 部落格：noahsnail.com | CSDN | 簡書宣告：作者翻譯論文僅為學習，如有侵權請聯絡作者刪除博文，謝謝！ Feature Pyramid Networks for Object Detection

Feature Pyramid Networks for Object Detection 總結

最近在閱讀FPN for object detection,看了網上的很多資料，有些認識是有問題的，當然有些很有價值。下面我自己總結了一下，以供參考。 1. FPN解決了什麼問題？答：在以往的faster rcnn進行目標檢測時，無論是rpn還是fast rcnn，roi 都作用在最後一

論文筆記：Learning Region Features for Object Detection

中心思想繼Relation Network實現可學習的nms之後，MSRA的大佬們覺得目標檢測器依然不夠fully learnable，這篇文章類似之前的Deformable ROI Pooling，主要在ROI特徵的組織上做文章，文章總結了現有的各種ROI Pooling變體，提出了一個統一的數學表示式