Applied Sciences, Vol. 12, Issue 20 (2022)

Rethinking Learnable Proposals for Graphical Object Detection in Scanned Document Images

Sankalp Sinha
Khurram Azeem Hashmi
Alain Pagani
Marcus Liwicki
Didier Stricker
Muhammad Zeshan Afzal

Abstract

In the age of deep learning, researchers have treated document image analysis as a domain adaptation problem, using the pre-training and fine-tuning paradigm to leverage the gains made in the natural image domain. However, the backbones and detection networks used in this paradigm are designed for object detection in natural images and do not account for some critical characteristics of document images: document images are sparse in contextual information, and their graphical page objects are logically clustered. This paper investigates the effectiveness of deep and robust backbones in the document image domain. Further, it explores the idea of learnable object proposals through Sparse R-CNN. The paper shows that simple domain adaptation of top-performing object detectors to the document image domain does not lead to better results. It also shows empirically that detectors based on dense object priors, such as Faster R-CNN, Mask R-CNN, and Cascade Mask R-CNN, are perhaps not best suited for graphical page object detection. Detectors that reduce the number of object candidates while making them learnable are a step towards a better approach. We formulate and evaluate the Sparse R-CNN (SR-CNN) model on the IIIT-AR-13k, PubLayNet, and DocBank datasets and hope to inspire a rethinking of object proposals in the domain of graphical page object detection.
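The contrast the abstract draws between dense object priors and a small set of learnable proposals can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `LearnableProposals`, the helper `dense_anchor_count`, the page size, the feature-map strides, the anchors per location, and the whole-image initialization are all assumptions chosen for illustration, and a real model would update the boxes through a deep learning framework rather than the hand-written step shown here.

```python
import math
from dataclasses import dataclass


def dense_anchor_count(width, height, strides, anchors_per_location):
    """Count the candidate boxes a dense-prior detector would enumerate.

    One set of anchors is placed at every location of every feature map.
    The strides and anchors_per_location are illustrative assumptions.
    """
    total = 0
    for stride in strides:
        cols = math.ceil(width / stride)
        rows = math.ceil(height / stride)
        total += cols * rows * anchors_per_location
    return total


@dataclass
class LearnableProposals:
    """A small, fixed set of proposal boxes treated as trainable parameters.

    Each box is [cx, cy, w, h], normalized to the range [0, 1].
    """
    boxes: list

    @classmethod
    def whole_image(cls, num_proposals):
        # Assumption: every proposal starts out covering the full page.
        return cls([[0.5, 0.5, 1.0, 1.0] for _ in range(num_proposals)])

    def to_pixels(self, width, height):
        """Convert normalized (cx, cy, w, h) boxes to pixel (x1, y1, x2, y2)."""
        pixel_boxes = []
        for cx, cy, w, h in self.boxes:
            pixel_boxes.append((
                (cx - w / 2) * width,
                (cy - h / 2) * height,
                (cx + w / 2) * width,
                (cy + h / 2) * height,
            ))
        return pixel_boxes

    def sgd_step(self, grads, lr):
        """Move each box coordinate against its gradient (plain SGD).

        This is what makes the proposals 'learnable': they are updated by
        the training loss like any other parameter.
        """
        for box, grad in zip(self.boxes, grads):
            for j in range(4):
                box[j] -= lr * grad[j]


# Hypothetical page size and detector settings, for comparison only.
dense = dense_anchor_count(800, 1024, strides=(4, 8, 16, 32, 64),
                           anchors_per_location=3)
proposals = LearnableProposals.whole_image(num_proposals=100)
sparse = len(proposals.boxes)
```

Under these assumed settings, the dense detector enumerates roughly two hundred thousand candidate boxes per page, while the sparse variant keeps 100 boxes whose coordinates are adjusted during training. The exact numbers depend entirely on the chosen settings and are not results from the paper.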
