Physiological curve extraction of the human ear based on the improved YOLACT

YUAN Li; XIA Tong; ZHANG Xiao-shuang

doi:10.13374/j.issn2095-9389.2021.01.11.005

Volume 44 Issue 8

Aug. 2022

Turn off MathJax

Article Contents

Article Navigation > Chinese Journal of Engineering > 2022 > 44(8): 1386-1395

YUAN Li, XIA Tong, ZHANG Xiao-shuang. Physiological curve extraction of the human ear based on the improved YOLACT[J]. Chinese Journal of Engineering, 2022, 44(8): 1386-1395. doi: 10.13374/j.issn2095-9389.2021.01.11.005

Citation:

YUAN Li, XIA Tong, ZHANG Xiao-shuang. Physiological curve extraction of the human ear based on the improved YOLACT[J]. Chinese Journal of Engineering, 2022, 44(8): 1386-1395. doi: 10.13374/j.issn2095-9389.2021.01.11.005

Citation:

PDF( 5000 KB)

Physiological curve extraction of the human ear based on the improved YOLACT

doi: 10.13374/j.issn2095-9389.2021.01.11.005

School of Automation, University of Science and Technology Beijing, Beijing 100083, China

More Information

Corresponding author: E-mail: lyuan@ustb.edu.cn
Received Date: 2021-01-11
Available Online: 2021-06-18
Publish Date: 2022-07-06

Abstract

Abstract

In related work, such as human ear shape clustering, three-dimensional human ear modeling, and personal customized headphones, the key physiological curves of the human ear and the accurate positions of key points need to be determined. Moreover, as an important biological feature, the morphological analysis and classification of the human ear are of considerable value for medical work related to the human ear. However, because of the complex morphological structure of the human ear, the generation of a general standard for the morphological structure of the human ear is difficult. This study divided the morphological structure of the human ear into three regions, namely, helix, antihelix, and concha, for instance segmentation and key physiological curve extraction. Traditional edge extraction methods are sensitive to illumination and posture variations. Moreover, the color distribution of one human ear image is relatively consistent. Thus, the transition among the three regions may not be obvious, which will cause poor adaptability for traditional edge extraction methods when extracting the key physiological curves of the human ear. To address this problem, this study proposed an improved YOLACT（You Only Look At CoefficienTs） instance segmentation model based on the ResNeSt backbone and the “screening mask” strategy, which improves the original YOLACT model from two aspects, namely, localization and segmentation. Our ResNeSt-based YOLACT model was trained with labeled ear images from the USTB-Helloear image set. In the prediction stage, the original cropping mask strategy was discarded and replaced with our proposed screening mask strategy to ensure the integrity of the edges of the segmentation area. These improvements enhance the accuracy of curve detection and extraction and can accurately segment different regions of the human ear and extract key physiological curves. Compared with other methods, our proposed method shows better segmentation accuracy on the test image set and is more robust to posture variations of the human ear.
- human ear,
- physiological curve extraction,
- instance segmentation,
- improved YOLACT,
- ResNeSt

FullText(HTML)

References(29)

References

[1]	楊月如, 吳紅斌. 耳廓的解剖學研究. 解剖學雜志, 1988, 11(1):56 Yang Y R, Wu H B. Anatomical study of auricle. Chin J Anat, 1988, 11(1): 56
[2]	齊娜, 李莉, 趙偉. 中國成年人耳廓形態測量及分類. 聲學技術, 2010, 29(5):518 Qi N, Li L, Zhao W. Morphometry and classification of Chinese adult's auricles. Tech Acoust, 2010, 29(5): 518
[3]	Azaria R, Adler N, Silfen R, et al. Morphometry of the adult human earlobe: A study of 547 subjects and clinical application. Plast Reconstr Surg, 2003, 111(7): 2398 doi: 10.1097/01.PRS.0000060995.99380.DE
[4]	Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation // Lecture Notes in Computer Science. Munich, 2015: 234
[5]	Milletari F, Navab N, Ahmadi S A. V-net: Fully convolutional neural networks for volumetric medical image segmentation // 2016 Fourth International Conference on 3D Vision (3DV). Stanford, 2016: 565
[6]	Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation // 2015 IEEE International Conference on Computer Vision (ICCV). Santiago, 2015: 1520
[7]	Zhao H S, Shi J P, Qi X J, et al. Pyramid scene parsing network // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 6230
[8]	Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, 2017: 2536
[9]	王志明, 劉志輝, 黃洋科, 等. 基于深度學習的高效火車號識別. 工程科學學報, 2020, 42(11):1525 Wang Z M, Liu Z H, Huang Y K, et al. Efficient wagon number recognition based on deep learning. Chin J Eng, 2020, 42(11): 1525
[10]	Chen X L, Girshick R, He K M, et al. TensorMask: A foundation for dense object segmentation // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 2061
[11]	Dai J F, He K M, Sun J. Instance-aware semantic segmentation via multi-task network cascades // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 3150
[12]	Li Y, Qi H Z, Dai J F, et al. Fully convolutional instance-aware semantic segmentation // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 4438
[13]	Bolya D, Zhou C, Xiao F Y, et al. YOLACT: real-time instance segmentation // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 9156
[14]	He K M, Gkioxari G, Dollár P, et al. Mask R-CNN // 2017 IEEE International Conference on Computer Vision (ICCV). Venice, 2017: 2980
[15]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137 doi: 10.1109/TPAMI.2016.2577031
[16]	Cai Z W, Vasconcelos N. Cascade R-CNN: Delving into high quality object detection // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 6154
[17]	Qin Z, Li Z M, Zhang Z N, et al. ThunderNet: towards real-time generic object detection on mobile devices // 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 6717
[18]	He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 770
[19]	Zhang H, Wu C R, Zhang Z Y, et al. ResNeSt: Split-attention networks [J/OL]. ArXiv Preprint (2020-04-19) [2020-12-31].https://arxiv.org/abs/2004.08955
[20]	Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 936
[21]	Xie S N, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 5987
[22]	Hu J, Shen L, Sun G. Squeeze-and-excitation networks // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 7132
[23]	Li X, Wang W H, Hu X L, et al. Selective kernel networks // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 510
[24]	He T, Zhang Z, Zhang H, et al. Bag of tricks for image classification with convolutional neural networks // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 558
[25]	Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell, 2017, 39(4): 640 doi: 10.1109/TPAMI.2016.2572683
[26]	Liu W, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector // Computer Vision – ECCV 2016. Amsterdam, 2016: 21
[27]	Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context // Computer Vision – ECCV 2014. Zurich, 2014: 740
[28]	Zhang Y, Mu Z C, Yuan L, et al. USTB-helloear: A large database of ear images photographed under uncontrolled conditions // International Conference on Image and Graphics. Shanghai, 2017: 405
[29]	Yuan L, Zhao H N, Zhang Y, et al. Ear alignment based on convolutional neural network // Chinese Conference on Biometric Recognition. Urumqi, 2018: 562