• Volume 44 Issue 8
    Aug.  2022
    YUAN Li, XIA Tong, ZHANG Xiao-shuang. Physiological curve extraction of the human ear based on the improved YOLACT[J]. Chinese Journal of Engineering, 2022, 44(8): 1386-1395. doi: 10.13374/j.issn2095-9389.2021.01.11.005

    Physiological curve extraction of the human ear based on the improved YOLACT

    doi: 10.13374/j.issn2095-9389.2021.01.11.005
    • Corresponding author: E-mail: lyuan@ustb.edu.cn
    • Received Date: 2021-01-11
      Available Online: 2021-06-18
    • Publish Date: 2022-07-06
    • Applications such as human ear shape clustering, three-dimensional human ear modeling, and personalized headphone design require the key physiological curves of the human ear and the accurate positions of its key points. Moreover, because the ear is an important biometric feature, its morphological analysis and classification are of considerable value for ear-related medical work. However, the complex morphological structure of the human ear makes it difficult to establish a general standard for describing it. This study divides the ear into three regions, namely, the helix, antihelix, and concha, for instance segmentation and key physiological curve extraction. Traditional edge extraction methods are sensitive to illumination and posture variations. In addition, the color distribution within a single ear image is relatively uniform, so the transitions among the three regions may not be obvious, and traditional edge extraction methods adapt poorly when extracting the key physiological curves. To address this problem, this study proposes an improved YOLACT (You Only Look At CoefficienTs) instance segmentation model based on the ResNeSt backbone and a “screening mask” strategy, which improves the original YOLACT model in two respects: localization and segmentation. The ResNeSt-based YOLACT model was trained with labeled ear images from the USTB-Helloear image set. In the prediction stage, the original cropping mask strategy is replaced with the proposed screening mask strategy to preserve the integrity of the edges of the segmented regions. These improvements enhance the accuracy of curve detection and extraction, allowing the model to segment the different regions of the human ear and extract the key physiological curves accurately. Compared with other methods, the proposed method achieves better segmentation accuracy on the test image set and is more robust to posture variations of the human ear.
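
The difference between cropping and screening a mask can be illustrated with a minimal sketch. The abstract names the two strategies but does not give the screening rule, so `screen_mask` below is a hypothetical stand-in: it assumes that screening keeps every connected foreground region touching the predicted box and drops the rest. `assemble_mask` and `crop_mask` follow the published YOLACT design (a sigmoid over a linear combination of prototype masks, then zeroing outside the box); all function names and the toy data are illustrative only.

```python
import math
from collections import deque

def assemble_mask(prototypes, coeffs):
    """Combine k prototype masks (each H x W) with k coefficients, then apply a sigmoid."""
    height, width = len(prototypes[0]), len(prototypes[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask

def crop_mask(mask, box):
    """YOLACT-style cropping: zero every pixel outside box = (x1, y1, x2, y2), x2/y2 exclusive."""
    x1, y1, x2, y2 = box
    return [[v if x1 <= x < x2 and y1 <= y < y2 else 0.0
             for x, v in enumerate(row)]
            for y, row in enumerate(mask)]

def screen_mask(mask, box, threshold=0.5):
    """ASSUMED screening rule (not taken from the paper): keep whole 4-connected
    foreground regions that touch the box, so a slightly too-small box does not
    cut the region's edge; regions that never touch the box are discarded."""
    x1, y1, x2, y2 = box
    height, width = len(mask), len(mask[0])
    keep = [[0] * width for _ in range(height)]
    queue = deque()
    # Seed the flood fill with foreground pixels inside the box.
    for y in range(max(y1, 0), min(y2, height)):
        for x in range(max(x1, 0), min(x2, width)):
            if mask[y][x] > threshold:
                keep[y][x] = 1
                queue.append((y, x))
    # Grow the seeds over connected foreground pixels, including those outside the box.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and not keep[ny][nx] and mask[ny][nx] > threshold):
                keep[ny][nx] = 1
                queue.append((ny, nx))
    return keep

# Toy example: one prototype, one row; foreground at x = 0..2 and x = 4.
prototypes = [[[4.0, 4.0, 4.0, -4.0, 4.0, -4.0, -4.0]]]
mask = assemble_mask(prototypes, [1.0])
box = (1, 0, 3, 1)  # the predicted box misses the pixel at x = 0
cropped = crop_mask(mask, box)
screened = screen_mask(mask, box)
```

In this toy case the cropped mask loses the pixel at x = 0 because the box is slightly too tight, whereas the screened mask retains the whole connected region and still discards the isolated pixel at x = 4.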

       

      Figures(10)  / Tables(6)
