Finding two corresponding image points that depict the same scene point in both images of a stereo camera setup (Appendix A) means to match the pixels of the reference image, typically the image of the left camera $I_l$, against each pixel in the matching image $I_r$ within a certain disparity range $d \in [d_{\min}, d_{\max}]$. In this, the similarity between the two pixels is modeled by a similarity measure from which a cost function can be deduced, which typically reaches its minimum when both pixels coincide. This so-called matching cost $c$ between a pixel $\mathbf{p} = (x, y)$ in the left image and a corresponding pixel $\mathbf{q} = (x - d, y)$, located according to a disparity shift $d$ in the right image, is stored in the three-dimensional cost volume $\mathcal{C}$:

$$\mathcal{C}(\mathbf{p}, d) = c\big(I_l(x, y),\, I_r(x - d, y)\big). \qquad (1)$$
Thus, the objective in finding two corresponding image points is to minimize the matching cost computed by Equation (1).
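To illustrate the construction of the cost volume and the minimization of the matching cost, the following C++ sketch fills $\mathcal{C}$ for every pixel and disparity candidate and extracts, per pixel, the disparity with the smallest cost. It is a minimal sketch only: the structure CostVolume, the placeholder cost function matchingCost (here a simple absolute intensity difference) and the winner-takes-all extraction are illustrative assumptions and do not reflect the optimized real-time implementation described in this work.

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Illustrative cost volume: one float cost per pixel and disparity candidate.
struct CostVolume {
    int width, height, numDisparities;
    std::vector<float> data;  // stored as [y][x][d]

    CostVolume(int w, int h, int nd)
        : width(w), height(h), numDisparities(nd),
          data(static_cast<size_t>(w) * h * nd, 0.0f) {}

    float& at(int x, int y, int d) {
        return data[(static_cast<size_t>(y) * width + x) * numDisparities + d];
    }
};

// Placeholder pixel-wise cost (absolute intensity difference); in the approach
// described above this would be the Hamming distance of the census transform
// or the inverted, truncated NCC.
inline float matchingCost(const uint8_t* left, const uint8_t* right,
                          int width, int x, int y, int d) {
    return static_cast<float>(std::abs(static_cast<int>(left[y * width + x]) -
                                       static_cast<int>(right[y * width + (x - d)])));
}

// Fill the cost volume: each reference pixel (x, y) is compared against the
// candidate pixel (x - d, y) in the matching image for every disparity d.
void computeCostVolume(const uint8_t* left, const uint8_t* right,
                       int width, int height, CostVolume& volume) {
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int d = 0; d < volume.numDisparities; ++d)
                volume.at(x, y, d) = (x - d >= 0)
                    ? matchingCost(left, right, width, x, y, d)
                    : std::numeric_limits<float>::max();
}

// Winner-takes-all: the disparity estimate for (x, y) is the d with minimal cost.
int bestDisparity(CostVolume& volume, int x, int y) {
    int best = 0;
    for (int d = 1; d < volume.numDisparities; ++d)
        if (volume.at(x, y, d) < volume.at(x, y, best))
            best = d;
    return best;
}
```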
When relying on distinctive image features such as SIFT [34], SURF [35] or ORB [36], a unique matching between two image points can be found. However, such image features can only be calculated in descriptive image regions, resulting in a very sparse correspondence field. Thus, in the case of dense image matching, Equation (1) is evaluated for each pixel in the reference image, computing a pixel-wise matching cost $c$ according to the chosen similarity measure, which indicates the similarity between the pixel pair. This cost is then stored inside a three-dimensional cost volume from which the disparity map is later extracted. In our work, we have implemented the two most commonly used similarity measures for real-time dense image matching, namely the Hamming distance of the non-parametric census transform (CT) [37] and the normalized cross-correlation (NCC). Since the Hamming distance of the CT is minimal when both image patches are most similar, it can directly be used as the matching cost $c$. The NCC, however, is equal to 1 when both image patches are equal. Thus, the NCC is inverted and truncated before being evaluated as matching cost: $c = 1 - \max(0, \mathrm{NCC})$.
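The following sketch outlines how the two similarity measures can be evaluated for a single pixel pair. The 5×5 support window, the 32-bit census descriptor and the helper names are illustrative assumptions, not the exact configuration used in our implementation; border handling and vectorized optimizations are omitted for brevity.

```cpp
#include <algorithm>
#include <bitset>
#include <cmath>
#include <cstdint>

// Census transform over a 5x5 window: encode, bit by bit, whether each
// neighbor is darker than the center pixel. Assumes (x, y) lies at least
// two pixels away from the image border.
uint32_t censusTransform5x5(const uint8_t* img, int width, int x, int y) {
    uint32_t descriptor = 0;
    const uint8_t center = img[y * width + x];
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx) {
            if (dx == 0 && dy == 0) continue;
            descriptor = (descriptor << 1) |
                         (img[(y + dy) * width + (x + dx)] < center ? 1u : 0u);
        }
    return descriptor;
}

// The Hamming distance of two census descriptors is zero for identical
// patches and can therefore be used directly as the matching cost.
inline int censusCost(uint32_t leftDesc, uint32_t rightDesc) {
    return static_cast<int>(std::bitset<32>(leftDesc ^ rightDesc).count());
}

// NCC over a 5x5 window; it equals 1 for identical patches, so the cost is
// formed by inverting and truncating it: cost = 1 - max(0, NCC).
float nccCost(const uint8_t* left, const uint8_t* right,
              int width, int xl, int xr, int y) {
    const int n = 25;
    float meanL = 0.0f, meanR = 0.0f;
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx) {
            meanL += left[(y + dy) * width + (xl + dx)];
            meanR += right[(y + dy) * width + (xr + dx)];
        }
    meanL /= n;
    meanR /= n;

    float num = 0.0f, varL = 0.0f, varR = 0.0f;
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx) {
            const float l = left[(y + dy) * width + (xl + dx)] - meanL;
            const float r = right[(y + dy) * width + (xr + dx)] - meanR;
            num += l * r;
            varL += l * l;
            varR += r * r;
        }
    const float ncc = num / (std::sqrt(varL * varR) + 1e-6f);
    return 1.0f - std::max(0.0f, ncc);
}
```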
Since the disparity is only evaluated along the same pixel row (Equation (1)), it is assumed that the input images $I_l$ and $I_r$ are rectified prior to the process of image matching. Similar to the approach described by Ruf et al. [19], we use a standard calibration routine implemented in the OpenCV library [38] to calculate two rectification maps, which allow an efficient resampling of the input images such that the epipolar lines lie horizontally on the image rows.
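A minimal sketch of this rectification step using the standard OpenCV routines cv::stereoRectify, cv::initUndistortRectifyMap and cv::remap is given below. It assumes that the intrinsic parameters (camera matrices and distortion coefficients) and the extrinsic parameters (R, T) between the two cameras are already available from a prior calibration; the variable names, map type and interpolation mode are illustrative choices, not necessarily those of our implementation.

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>

// Rectify a stereo image pair so that the epipolar lines coincide with the
// image rows, using precomputed calibration parameters.
void rectifyStereoPair(const cv::Mat& leftRaw, const cv::Mat& rightRaw,
                       const cv::Mat& Kl, const cv::Mat& Dl,
                       const cv::Mat& Kr, const cv::Mat& Dr,
                       const cv::Mat& R, const cv::Mat& T,
                       cv::Mat& leftRect, cv::Mat& rightRect) {
    const cv::Size imageSize = leftRaw.size();
    cv::Mat R1, R2, P1, P2, Q;

    // Compute the rectifying rotations and projection matrices for both cameras.
    cv::stereoRectify(Kl, Dl, Kr, Dr, imageSize, R, T, R1, R2, P1, P2, Q);

    // Precompute the two rectification maps once; they can be reused for every
    // incoming image pair, which allows an efficient resampling.
    cv::Mat mapLx, mapLy, mapRx, mapRy;
    cv::initUndistortRectifyMap(Kl, Dl, R1, P1, imageSize, CV_32FC1, mapLx, mapLy);
    cv::initUndistortRectifyMap(Kr, Dr, R2, P2, imageSize, CV_32FC1, mapRx, mapRy);

    // Resample the input images according to the rectification maps.
    cv::remap(leftRaw, leftRect, mapLx, mapLy, cv::INTER_LINEAR);
    cv::remap(rightRaw, rightRect, mapRx, mapRy, cv::INTER_LINEAR);
}
```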