Version 1
Received: 12 November 2018 / Approved: 13 November 2018 / Online: 13 November 2018 (12:57:10 CET)
How to cite:
Kubota, J.; Spaulding, R.; Zhu, C.; Izdebski, K.; Yan, Y. ROI Detection with Machine Learning for Glottis Images Captured from High-speed Video-endoscopy. Preprints 2018, 2018110314. https://doi.org/10.20944/preprints201811.0314.v1
APA Style
Kubota, J., Spaulding, R., Zhu, C., Izdebski, K., & Yan, Y. (2018). ROI Detection with Machine Learning for Glottis Images Captured from High-speed Video-endoscopy. Preprints. https://doi.org/10.20944/preprints201811.0314.v1
Chicago/Turabian Style
Kubota, J., Krzysztof Izdebski and Yuling Yan. 2018. "ROI Detection with Machine Learning for Glottis Images Captured from High-speed Video-endoscopy." Preprints. https://doi.org/10.20944/preprints201811.0314.v1
Abstract
Detection of the region of interest (ROI) is a critical step in laryngeal image analysis for delineating the glottis contour. ROI detection can improve both the computational efficiency and the accuracy of the image segmentation task, which in turn facilitates subsequent analysis and characterization of vocal fold vibration as it correlates with voice quality and pathology. This study aims to develop machine-learning-based approaches for automatic ROI detection in glottis image sequences captured by high-speed video-endoscopy (HSV), a clinical laryngeal imaging modality. In particular, we first applied a support vector machine (SVM) using the histogram of oriented gradients (HOG) feature descriptor, and second, trained a convolutional neural network (CNN) model for the same task. The two approaches are compared in terms of recognition accuracy and computation time.
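To illustrate the first approach named in the abstract, the following is a minimal sketch of HOG features feeding a linear SVM that scores sliding windows over a frame. It is not the authors' implementation. Everything in it is an assumption made for illustration: the single-cell HOG (no block normalization), the subgradient-descent trainer standing in for a library SVM, the synthetic images in which a dark vertical slit stands in for the glottal gap, and all function names and parameter values. The CNN approach is not sketched here.

```python
import math

def hog_descriptor(patch, bins=9):
    """Simplified single-cell HOG: magnitude-weighted histogram of unsigned
    gradient orientations (0-180 degrees), L2-normalized."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(patch) - 1):
        for c in range(1, len(patch[0]) - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]  # central differences
            gy = patch[r + 1][c] - patch[r - 1][c]
            mag = math.hypot(gx, gy)
            if mag > 0.0:
                angle = math.degrees(math.atan2(gy, gx)) % 180.0
                hist[int(angle // bin_width) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # flat patch: keep zeros
    return [v / norm for v in hist]

def decision(w, b, x):
    """Signed score of a feature vector under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM trained by subgradient descent on the L2-regularized
    hinge loss. Labels must be +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * decision(w, b, xi) < 1
            for j in range(len(w)):
                w[j] -= lr * (lam * w[j] - (yi * xi[j] if violated else 0.0))
            if violated:
                b += lr * yi
    return w, b

def detect_roi(frame, w, b, size=16, stride=4):
    """Slide a size x size window over the frame and return the top-left
    corner and score of the highest-scoring window."""
    best_score, best_pos = -math.inf, None
    for r0 in range(0, len(frame) - size + 1, stride):
        for c0 in range(0, len(frame[0]) - size + 1, stride):
            window = [row[c0:c0 + size] for row in frame[r0:r0 + size]]
            score = decision(w, b, hog_descriptor(window))
            if score > best_score:
                best_score, best_pos = score, (r0, c0)
    return best_pos, best_score

def make_image(height, width, dark_rows, dark_cols):
    """Bright synthetic image with one dark rectangle (toy data only)."""
    return [[0.0 if (r in dark_rows and c in dark_cols) else 1.0
             for c in range(width)] for r in range(height)]

# Toy training set: vertical slits are positives, horizontal stripes negatives.
positives = [make_image(16, 16, range(16), range(k - 1, k + 2)) for k in (3, 6, 9, 12)]
negatives = [make_image(16, 16, range(k - 1, k + 2), range(16)) for k in (3, 6, 9, 12)]
X = [hog_descriptor(p) for p in positives + negatives]
y = [1] * len(positives) + [-1] * len(negatives)
w, b = train_linear_svm(X, y)

# Held-out patches and a synthetic frame with a slit at rows 8-23, columns 29-31.
slit_patch = make_image(16, 16, range(16), range(6, 9))
stripe_patch = make_image(16, 16, range(4, 7), range(16))
frame = make_image(32, 48, range(8, 24), range(29, 32))
(r0, c0), score = detect_roi(frame, w, b)
print("ROI top-left:", (r0, c0), "score:", round(score, 3))
```

In practice, a library HOG implementation with block normalization and a tuned SVM would likely replace these hand-rolled pieces, and the window size and stride would be chosen to match the HSV frame resolution.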
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.