An Unsupervised Depth-Estimation Model for Monocular Images Based on Perceptual Image Error Assessment

Hyeseung Park and Seungchul Park

Abstract

In this paper, we propose a novel unsupervised learning-based model for estimating depth from monocular images that combines a simple ResNet-based auto-encoder with specially designed loss functions. We use only stereo images obtained from binocular cameras as training data, without any depth ground-truth data. Our model outputs a disparity map, which is needed to warp an input image into an image corresponding to a different viewpoint. When the input image is warped using the output disparity map, distortions of various patterns inevitably occur in the reconstructed image. During training, the frequency and size of these distortions gradually decrease while the similarity between the reconstructed and target images increases, which indicates that the accuracy of the predicted disparity maps also increases. An important factor in this type of training is therefore an efficient loss function that accurately measures the difference in quality between the reconstructed and target images and guides that gap to close properly and quickly as training progresses. In recent related studies, the photometric difference was calculated with simple measures such as the L1 or L2 loss, or by combining one of these with a traditional hand-coded computer-vision image-quality assessment algorithm such as SSIM. However, these methods are limited in modeling various distortion patterns at the level of the human visual system. The proposed model therefore uses a pre-trained perceptual image-quality assessment model, which effectively mimics human perception mechanisms to measure the quality of distorted images, as its image-reconstruction loss. To highlight the performance of the proposed loss functions, our model adopts a simple ResNet50-based network. We trained our model on stereo images from the KITTI 2015 driving dataset to estimate pixel-level depth for 768 × 384 images. Despite the simplicity of the network structure, thanks to the effectiveness of the proposed image-reconstruction loss, our model outperformed other state-of-the-art unsupervised methods on a variety of evaluation metrics.
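The warp-and-compare idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`warp_row`, `warp_image`, `l1_loss`) are hypothetical, the images are single-channel nested lists, and the sign convention (the left view is reconstructed by sampling the right view at `x - disparity`) is an assumption. The loss shown is the plain L1 photometric baseline that the abstract contrasts with, used here only as a stand-in for the pre-trained perceptual quality model that the paper proposes.

```python
import math

def warp_row(src_row, disp_row):
    """Reconstruct one scanline by sampling src_row at x - disparity.

    Assumed convention: a positive disparity shifts content to the right.
    Fractional positions are resolved by linear interpolation.
    """
    w = len(src_row)
    out = []
    for x, d in enumerate(disp_row):
        xs = min(max(x - d, 0.0), w - 1.0)  # clamp sampling position to the row
        x0 = int(math.floor(xs))
        x1 = min(x0 + 1, w - 1)
        frac = xs - x0
        out.append((1.0 - frac) * src_row[x0] + frac * src_row[x1])
    return out

def warp_image(src, disparity):
    """Warp every scanline of a single-channel image (list of rows)."""
    return [warp_row(s, d) for s, d in zip(src, disparity)]

def l1_loss(reconstructed, target):
    """Mean absolute difference between two images of the same size."""
    total, count = 0.0, 0
    for r_row, t_row in zip(reconstructed, target):
        for r, t in zip(r_row, t_row):
            total += abs(r - t)
            count += 1
    return total / count

# Toy stereo pair: `left` is `right` shifted by 2 pixels (border padded).
right = [[0.0, 1.0, 2.0, 3.0, 4.0, 5.0]]
left = [[0.0, 0.0, 0.0, 1.0, 2.0, 3.0]]

good = warp_image(right, [[2.0] * 6])  # correct disparity
bad = warp_image(right, [[0.0] * 6])   # wrong disparity (no shift)
print(l1_loss(good, left), l1_loss(bad, left))
```

With the correct disparity the reconstruction matches the target and the loss is zero, while a wrong disparity leaves a residual. In training, a differentiable version of this signal is what drives the disparity predictions; the paper's contribution is to replace the pixel-wise comparison with a perceptual quality score.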

Keywords

monocular depth estimation - perceptual image-quality assessment - PieAPP - KITTI

ISSUE

Volume: 12 Issue: 17 (2022)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Water
Journal of Science and Applicative Technology
Acta Scientiarum: Technology

DOI

https://doi.org/10.3390/app12178829

Similar articles

GDUI: Guided Diffusion Model for Unlabeled Images

Xuanyuan Xie and Jieyu Zhao

The diffusion model has made progress in the field of image synthesis, especially in the area of conditional image synthesis. However, this improvement is highly dependent on large annotated datasets. To tackle this challenge, we present the Guided Diffu...

Journal: Algorithms

Automated Brain Tumor Identification in Biomedical Radiology Images: A Multi-Model Ensemble Deep Learning Approach

Sarfaraz Natha, Umme Laila, Ibrahim Ahmed Gashim, Khalid Mahboob, Muhammad Noman Saeed and Khaled Mohammed Noaman

Brain tumors (BT) represent a severe and potentially life-threatening cancer. Failing to promptly diagnose these tumors can significantly shorten a person's life. Therefore, early and accurate detection of brain tumors is essential, allowing for appropri...

Journal: Applied Sciences

Identification of the Surface Cracks of Concrete Based on ResNet-18 Depth Residual Network

Rong Wang, Xinyang Zhou, Yi Liu, Dongqi Liu, Yu Lu and Miao Su

To ensure the safety and durability of concrete structures, timely detection and classification of concrete cracks using a low-cost and high-efficiency method is necessary. In this study, a concrete surface crack damage detection method based on the ResN...

Journal: Applied Sciences

Evaluation Model of Rice Seedling Production Line Seeding Quality Based on Deep Learning

Yongbo Liu, Peng He, Yan Cao, Conghua Zhu and Shitao Ding

A critical precondition for realizing mechanized transplantation in rice cultivation is the implementation of seedling tray techniques. To augment the efficacy of seeding, a precise evaluation of the quality of rice seedling cultivation in these trays is...

Journal: Applied Sciences

GEA-MSNet: A Novel Model for Segmenting Remote Sensing Images of Lakes Based on the Global Efficient Attention Module and Multi-Scale Feature Extraction

Qiyan Li, Zhi Weng, Zhiqiang Zheng and Lixin Wang

The decrease in lake area has garnered significant attention within the global ecological community, prompting extensive research in remote sensing and computer vision to accurately segment lake areas from satellite images. However, existing image segmen...

Journal: Applied Sciences
