原论文:Progressive Semantic Segmentation
问题描述
当对大型图片进行语义分割时,可能会导致显存炸掉。收到内存限制,可以选择下采样,或将图像划分为局部块。但前者会丢失细节,后者会却反全局视图。
后处理改善分割细节
经典方法
条件随机场(CRF),引导滤波器(GF),两个速度慢,改进是渐进的。
深度学习的引导过滤器(DGF)可以提高推理速度
原论文:Progressive Semantic Segmentation
当对大型图片进行语义分割时,可能会导致显存炸掉。收到内存限制,可以选择下采样,或将图像划分为局部块。但前者会丢失细节,后者会却反全局视图。
条件随机场(CRF),引导滤波器(GF),两个速度慢,改进是渐进的。
深度学习的引导过滤器(DGF)可以提高推理速度
原论文Feature Pyramid Networks for Object Detection。
这篇论文就是大家熟知的FPN了。FPN是比较早期的一份工作(请注意,这篇论文只是多尺度特征融合的一种方式。不过这篇论文提出的比较早(CVPR2017),在当时看来是非常先进的),在当时具有很多亮点:FPN主要解决的是物体检测中的多尺度问题,通过简单的网络连接改变,在基本不增加原有模型计算量情况下,大幅度提升了小物体检测的性能。
Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning object detectors have avoided pyramid representations, in part because they are compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using FPN in a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.
这篇论文对以后的许多网络设计产生了较大的影响,推荐你阅读原文。这里只是对这篇论文的粗浅阅读笔记。
论文名称: MSR-net:Low-light Image Enhancement Using Deep Convolutional Network
论文作者: Liang Shen, Zihan Y ue, Fan Feng, Quan Chen, Shihao Liu, Jie Ma
Code: None
这是一篇讲解基于Retinex理论使用卷积神经网络进行低照度增强的论文。
Images captured in low-light conditions usually suffer from very low contrast, which increases the difficulty of sub-sequent computer vision tasks in a great extent. In this paper, a low-light image enhancement model based on convolutional neural network and Retinex theory is proposed. Firstly, we show that multi-scale Retinex is equivalent to a feedforward convolutional neural network with different Gaussian convolution kernels. Motivated by this fact, we consider a Convolutional Neural Network(MSR-net) that directly learns an end-to-end mapping between dark and bright images. Different fundamentally from existing approaches, low-light image enhancement in this paper is regarded as a machine learning problem. In this model, most of the parameters are optimized by back-propagation, while the parameters of traditional models depend on the artificial setting. Experiments on a number of challenging images reveal the advantages of our method in comparison with other state-of-the-art methods from the qualitative and quantitative perspective.
本文提出了一种基于卷积神经网络和视网膜理论(Retinex Theory)的低照度图像增强模型。证明了多尺度视网膜等价于一个具有不同高斯卷积核的前馈卷积神经网络。考虑一种卷积神经网络(MSR网络),它直接学习暗图像和亮图像之间的端到端映射。
论文名称: LLCNN: A convolutional neural network for low-light image enhancement
论文作者: Li Tao, Chuang Zhu, Guoqing Xiang, Yuan Li, Huizhu Jia, Xiaodong Xie
这是一篇讲解使用卷积神经网络进行低照度增强的论文。
In this paper, we propose a CNN based method to perform low-light image enhancement. We design a special module to utilize multiscale feature maps, which can avoid gradient vanishing problem as well. In order to preserve image textures as much as possible, we use SSIM loss to train our model. The contrast of low-light images can be adaptively enhanced using our method. Results demonstrate that our CNN based method outperforms other contrast enhancement methods.
本文提出了一种基于CNN的低照度图像增强方法。我们设计了一个特殊的模块来利用多尺度特征映射,这样可以避免梯度消失的问题。为了尽可能地保留图像纹理,我们使用SSIM损失来训练我们的模型。使用我们的方法可以自适应地增强弱光图像的对比度。
Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun
This paper revisits feature pyramids networks (FPN) for one-stage detectors and points out that the success of FPN is due to its divide-and-conquer solution to the optimization problem in object detection rather than multi-scale feature fusion. From the perspective of optimization, we introduce an alternative way to address the problem instead of adopting the complex feature pyramids - {\em utilizing only one-level feature for detection}. Based on the simple and efficient solution, we present You Only Look One-level Feature (YOLOF). In our method, two key components, Dilated Encoder and Uniform Matching, are proposed and bring considerable improvements. Extensive experiments on the COCO benchmark prove the effectiveness of the proposed model. Our YOLOF achieves comparable results with its feature pyramids counterpart RetinaNet while being 2.5× faster. Without transformer layers, YOLOF can match the performance of DETR in a single-level feature manner with 7× less training epochs. With an image size of 608×608, YOLOF achieves 44.3 mAP running at 60 fps on 2080Ti, which is 13% faster than YOLOv4. Code is available at this https URL.
本文简称YOLOF。截至到本文写作时,二阶段和单阶段目标检测的SOTA方法中广泛使用了多尺度特征融合的方法。FPN方法几乎已经称为了网络中理所应当的一个组件。
本文中作者重新回顾了FPN模块,并指出FPN的两个优势分别是其分治(divide-and-conquer)的解决方案、以及多尺度特征融合。本文在单阶段目标检测器上研究了FPN的这两个优势,并在RetinaNet上进行了实验,将上述两个优势解耦,分别研究其发挥的作用,并指出,FPN在多尺度特征融合上发挥的作用可能没有想象中那么大。
最后,作者提出YOLOF,这是一个不使用FPN的目标检测网络。其主要创新是:
该网络在达到RetinaNet对等精度的情况下速度提升了2.5倍。