Submitted:
19 June 2023
Posted:
19 June 2023
Abstract
Keywords:
1. Introduction
2. Problem Statement
- What are the classes of perceived entities?
- What are the identities of perceived entities?
- Where are the locations of perceived entities inside related images?
- Where are the locations of perceived entities inside scenes?
- How to determine the presence of an entity in the left camera’s image plane?
- How to find the match in the right camera’s image plane if an entity has been detected in the left camera’s image plane?
3. Similar Works on Stereo Matching
- Methods that attempt to match points within a pair of stereo images [14].
- Methods that attempt to match edges or contours within a pair of stereo images [15].
- Methods that attempt to match line segments within a pair of stereo images [16].
- Methods that attempt to match curves within a pair of stereo images [17].
- Methods that attempt to match regions within a pair of stereo images [18].
- Methods that attempt to match objects within a pair of stereo images [19].
4. The Outline of Proposed Principle
- Image acquisition by both cameras.
- Image sampling on the video stream from the left camera.
- Hybrid feature extraction for each image sample.
- Cognition of image samples that correspond to the training data of reference entities inside training images.
- Recognition of image samples that correspond to possible occurrences of reference entities inside real-time images.
- Forward/inverse processes of template matching, which work together to find the matched candidate in the right image when a recognized entity is present in the left image.
5. Top-Down Strategy of Doing Image Sampling
- It is difficult to determine, or to justify, the size of the sub-window used to scan an input image. If the sub-window size is allowed to change dynamically, the next question is how to perform such dynamic adjustment.
- The number of image samples obtained is independent of the content of the input image. For example, an input image may contain a single entity. Even in this case, the scanning method still produces many image samples, all of which become input to the subsequent visual processes of classification, identification, grouping, etc. These irrelevant image samples may cause trouble for the visual processes of recognition.
- with one sample, then and .
- with two samples, then and .
- with three samples, then and .
- with four samples, then and .
- and so on.
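To make the second drawback of sub-window scanning concrete, the following minimal sketch counts the samples a raster scan produces. The function name, the image size, the window size, and the stride are all hypothetical values chosen for illustration; they are not taken from the paper.

```python
def count_scan_windows(width, height, win, stride):
    """Count the square sub-windows a raster scan yields on a width x height image.

    The count depends only on the image size, the window size and the stride,
    never on what the image actually contains.
    """
    if win > width or win > height:
        return 0
    cols = (width - win) // stride + 1   # window positions along each row
    rows = (height - win) // stride + 1  # window positions along each column
    return cols * rows


# Hypothetical numbers: a 640 x 480 image scanned with a 64-pixel window
# and a 32-pixel stride yields 266 samples, even if only one entity is present.
print(count_scan_windows(640, 480, 64, 32))
```

Every one of these samples would be passed to the later recognition stages, which is the cost that a content-driven sampling strategy tries to avoid.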
6. Feature Extraction from Sample Image in Time-Domain
- The mean value of approximate electromagnetic energy:
- The square-root of the variance of approximate electromagnetic energy:
- The horizontal distribution of approximate electromagnetic energy:
- The vertical distribution of approximate electromagnetic energy:
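The four time-domain features listed above could be computed along the following lines. This is only a sketch under stated assumptions: pixel intensity stands in for the approximate electromagnetic energy, the variance is the population variance, and the horizontal and vertical distributions are taken to be column sums and row sums normalized by the total energy. The paper's own equations are not reproduced here, and the function name is hypothetical.

```python
import math


def time_domain_features(sample):
    """Compute illustrative time-domain features of an image sample.

    `sample` is a non-empty list of equal-length rows of non-negative numbers.
    All definitions below are assumptions, not the paper's equations.
    """
    rows, cols = len(sample), len(sample[0])
    n = rows * cols
    total = sum(sum(row) for row in sample)

    mean = total / n
    variance = sum((v - mean) ** 2 for row in sample for v in row) / n
    std = math.sqrt(variance)

    col_sums = [sum(sample[i][j] for i in range(rows)) for j in range(cols)]
    row_sums = [sum(row) for row in sample]
    if total == 0:
        # A completely dark sample: fall back to uniform distributions.
        horizontal = [1.0 / cols] * cols
        vertical = [1.0 / rows] * rows
    else:
        horizontal = [c / total for c in col_sums]  # energy share per column
        vertical = [r / total for r in row_sums]    # energy share per row

    return {"mean": mean, "std": std,
            "horizontal": horizontal, "vertical": vertical}


features = time_domain_features([[1, 2], [3, 4]])
print(features["mean"], features["horizontal"], features["vertical"])
```

Normalizing the two distributions by the total energy makes them sum to one, so they describe where the energy sits rather than how much of it there is; that choice is part of the assumption above.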
7. Feature Extraction from Sample Image in Frequency-Domain
8. Cognition Process Using RCE Neural Network
9. Recognition Process Using Possibility Function
10. Forward/Inverse Processes of Template Matching
- Determine the equation of the epipolar line from the stereovision system's calibration parameters and the coordinates of location a (NOTE: this derivation can be found in any textbook on computer vision).
- Scan the epipolar line location by location.
- Take an image sample at the currently scanned location e.
- Compute the feature vector of this image sample.
- Compute the cosine distance between this image sample's feature vector and the recognized sample's feature vector.
- Repeat until the scan is completed.
- Choose as the candidate matched sample the image sample that minimizes the cosine distance.
- Use the cosine distance between the recognized sample and the chosen candidate to compute the possibility value of the match (i.e., use Equation (18)).
- Accept the matched sample if the possibility value of the match is greater than a chosen threshold value (e.g., 0.5).
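The forward steps above can be sketched as follows. Several parts are assumptions made for illustration: the calibration is summarized by a fundamental matrix `F` (so the epipolar line in the right image is `F` applied to the homogeneous left-image point), the feature vectors of the scanned samples are supplied already computed, and `possibility` is a simple placeholder that maps cosine distance to [0, 1]; it is not Equation (18). All function names are hypothetical.

```python
import math


def epipolar_line(F, a):
    """Return (A, B, C) of the line A*u + B*v + C = 0 in the right image
    for the left-image location a = (u, v), given a 3x3 fundamental matrix F."""
    u, v = a
    return tuple(F[i][0] * u + F[i][1] * v + F[i][2] for i in range(3))


def cosine_distance(x, y):
    """1 minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    if nx == 0 or ny == 0:
        return 1.0
    return 1.0 - dot / (nx * ny)


def possibility(distance):
    """Placeholder for Equation (18): an assumed mapping from distance to [0, 1]."""
    return max(0.0, 1.0 - distance)


def forward_match(recognized_features, candidates, threshold=0.5):
    """Pick the candidate on the epipolar line with the smallest cosine distance.

    `candidates` is a list of (location, feature_vector) pairs sampled along
    the epipolar line. Returns (location, possibility) or None if rejected.
    """
    best = None
    for location, features in candidates:
        d = cosine_distance(recognized_features, features)
        if best is None or d < best[1]:
            best = (location, d)
    if best is None:
        return None
    p = possibility(best[1])
    return (best[0], p) if p > threshold else None


# Rectified stereo pair (assumed): the epipolar line is the same image row.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
print(epipolar_line(F, (10, 20)))  # coefficients of the line v = 20
print(forward_match([1, 0], [((5, 20), [0, 1]), ((7, 20), [1, 0.1])]))
```

Separating the distance from the possibility keeps the selection step (smallest distance) independent of whichever possibility function is eventually plugged in.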
- Determine the equation of the epipolar line from the stereovision system's calibration parameters and the coordinates of location a.
- Scan the epipolar line location by location.
- Take an image sample at the currently scanned location e.
- Divide the image sample at e into a matrix of sub-samples.
- Use each sub-sample as a template and perform forward template matching against the recognized sample.
- Compute the mean of the possibility values that measure the match between each sub-sample and the recognized sample. This mean value represents the possibility that the image sample in the right image matches the recognized sample in the left image.
- Repeat until the scan is completed.
- Choose as the candidate matched sample the image sample that maximizes the possibility value of the match (i.e., as calculated by Equation (21)).
- Accept the match if the possibility value of the match is greater than a chosen threshold value (e.g., 0.5).
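A minimal sketch of the inverse steps follows, again under explicit assumptions. Each sub-sample and each part of the recognized sample is represented by a precomputed feature vector; matching a sub-sample against the recognized sample is approximated by taking its best possibility over those parts; `possibility` is a placeholder, not Equation (18) or (21); and the best candidate is assumed to be the one with the largest mean possibility, since acceptance requires the value to exceed a threshold. All names are hypothetical.

```python
import math


def cosine_distance(x, y):
    """1 minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    if nx == 0 or ny == 0:
        return 1.0
    return 1.0 - dot / (nx * ny)


def possibility(distance):
    """Placeholder possibility function: an assumed mapping to [0, 1]."""
    return max(0.0, 1.0 - distance)


def sub_sample_possibility(sub_features, recognized_parts):
    """Best possibility of one sub-sample (used as template) over the
    parts of the recognized sample; 0.0 if there are no parts."""
    return max(
        (possibility(cosine_distance(sub_features, part))
         for part in recognized_parts),
        default=0.0,
    )


def inverse_match(recognized_parts, candidates, threshold=0.5):
    """Score each right-image candidate by the mean possibility of its sub-samples.

    `candidates` is a list of (location, [sub-sample feature vectors]) pairs
    taken along the epipolar line. Returns (location, mean possibility) or None.
    """
    best = None
    for location, sub_samples in candidates:
        if not sub_samples:
            continue
        scores = [sub_sample_possibility(s, recognized_parts) for s in sub_samples]
        mean_p = sum(scores) / len(scores)
        if best is None or mean_p > best[1]:
            best = (location, mean_p)
    if best is None or best[1] <= threshold:
        return None
    return best


recognized = [[1, 0], [0, 1]]
candidates = [((3, 8), [[1, 0], [0, 1]]), ((9, 8), [[1, 1], [-1, 0]])]
print(inverse_match(recognized, candidates))
```

Averaging over sub-samples means a candidate is accepted only when most of its parts find support in the recognized sample, which is the intent of running the match in the inverse direction.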
11. Implementation and Results
11.1. Results of Top-Down Sampling Strategy of Input Images
11.2. Examples of Training Data for Cognition (i.e., Learning)
11.3. Results of Feature Extraction in Time Domain
11.4. Results of Feature Extraction in Frequency Domain
11.5. Results of Cognition
11.6. Results of Recognition
11.7. Results of Stereo Matching
12. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Xie, M.; Hu, Z.C.; Chen, H. New Foundation of Artificial Intelligence; World Scientific, 2021.
- Cassinelli, A.; Reynolds, C.; Ishikawa, M. Augmenting Spatial Awareness with Haptic Radar. IEEE International Symposium on Wearable Computers, 2006, pp. 61–64.
- Li, Y.; Ibanez-Guzman, J. Lidar for Autonomous Driving: The Principles, Challenges, and Trends for Automotive Lidar and Perception Systems. IEEE Signal Process. Mag. 2020, 37, 50–61.
- Rashidi, A.; Fathi, H.; Brilakis, I. Innovative Stereo Vision-Based Approach to Generate Dense Depth Map of Transportation Infrastructure. Transp. Res. Rec. 2011, 2215, 93–99.
- Xie, M. Key Steps Toward Development of Humanoid Robots. In 25th International Conference on Climbing and Walking Robots, Robotics in Natural Settings, Lecture Notes in Networks and Systems; Springer, 2022.
- Xie, M.; Velamala, S. Maritime Autonomous Vessels: A Review of RobotX Challenge’s Works. J. Technol. Soc. Sci. 2018, 2, 7–14.
- Gordon, I.E. Theories of Visual Perception, 3rd ed.; Psychology Press, 2004.
- Bekey, G.A. Autonomous Robots: From Biological Inspiration to Implementation and Control; The MIT Press, 2005.
- Roberts, D.A.; Yaida, S. The Principles of Deep Learning Theory; Cambridge University Press, 2022.
- Wu, X.W.; Sahoo, D.; Hoi, S.C.H. Recent advances in deep learning for object detection. Neurocomputing 2020, 396, 39–64.
- Rogister, P.; Benosman, R.; Ieng, S.H.; Lichtsteiner, P.; Delbruck, T. Asynchronous Event-Based Binocular Stereo Matching. IEEE Trans. Neural Netw. Learn. Syst. 2012, 23, 347–353.
- Yang, G.S.; Manela, J.; Happold, M.; Ramanan, D. Hierarchical Deep Stereo Matching on High-Resolution Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019; pp. 5515–5524.
- Bleyer, M.; Breiteneder, C. Stereo Matching: State-of-the-Art and Research Challenges. In Edited Book of Advances in Computer Vision and Pattern Recognition; Springer, 2013; pp. 143–179.
- Chang, J.R.; Chen, Y.S. Pyramid Stereo Matching Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2018; pp. 5410–5418.
- Xie, M. A Cooperative Strategy for The Matching of Multi-level Edge Primitives. Image Vis. Comput. 1995, 13, 89–99.
- Medioni, G.; Nevatia, R. Segment-based stereo matching. Comput. Vis. Graph. Image Process. 1985, 31, 2–18.
- Zhang, Y.N.; Gerbrands, J.J. Method for matching general stereo planar curves. Image Vis. Comput. 1995, 13, 645–655.
- Wang, Z.F.; Zheng, Z.G. A region based stereo matching algorithm using cooperative optimization. IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1–8.
- Li, L.; Fang, M.; Yin, Y.; Lian, J.; Wang, Z. A Traffic Scene Object Detection Method Combining Deep Learning and Stereo Vision Algorithm. In Proceedings of the IEEE International Conference on Real-Time Computing and Robotics (RCAR); 2021; pp. 1134–1138.
- Yin, X.M.; Guo, D.; Xie, M. Hand image segmentation using color and RCE neural network. Robot. Auton. Syst. 2001, 34, 235–250.
- Cooper, P.W. The hypersphere in pattern recognition. Inf. Control 1962, 5, 324–346.
- Morgan, D.P.; Scofield, C.L. ANN Keyword Recognition. In Neural Networks and Speech Processing; The Springer International Series in Engineering and Computer Science, 1991; Volume 130.
- Cooper, L.N. How We Remember: Toward an Understanding of Brain and Neural Systems; World Scientific, 1995.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).