Efficient wagon number recognition based on deep learning

Wang Zhiming; Liu Zhihui; Huang Yangke; Xing Yuxiang

doi:10.13374/j.issn2095-9389.2019.12.05.001

CLC number: TP391
Publication history
- Received: 2019-12-05
- Published: 2020-11-25

Efficient wagon number recognition based on deep learning

1. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. Department of Engineering Physics, Tsinghua University, Beijing 100084, China

Corresponding author: E-mail: wangzhiming@ustb.edu.cn

Abstract: Automatic wagon number recognition plays an important role in railroad transportation systems. However, the wagon number characters occupy only a very small area of the whole wagon image and are often affected by uneven illumination, complex backgrounds, image contamination, and broken character strokes, which makes high-precision automatic recognition difficult. In recent years, object detection algorithms based on deep learning have made great progress, providing a solid technical basis for improving wagon number recognition. This paper proposes an efficient two-phase wagon number recognition algorithm based on the high-performance YOLOv3 object detection algorithm. In the first phase, the wagon number region is located in a low-resolution global image. In the second phase, the characters are detected in a high-resolution local image and assembled into a 12-character wagon number according to their spatial positions; the final wagon number is obtained after verification based on the recognition confidence of each character and the international wagon number coding rules. In addition, we propose a new network-pruning algorithm based on the batch normalization scale factor and filter correlation: the importance of each filter is computed from both the correlation between filter weights and the scale factor of the batch normalization layer. Pruning and retraining the region detection and character detection models reduces storage occupation and computational complexity without sacrificing recognition accuracy (which even improved slightly in our experiments). Finally, we tested the proposed algorithm on 1072 images from practical engineering application scenarios. It achieved an overall accuracy of 96.92% (here, "correct" means that all 12 characters are detected and recognized correctly), with an average recognition time of only 191 ms.
Keywords: pattern recognition; wagon number recognition; deep learning; neural network; object detection; model pruning
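
The pruning idea in the abstract (filter importance derived from the batch normalization scale factor and from filter correlation) can be illustrated with a small Python sketch. The abstract does not give the scoring formula, so everything specific here is an assumption made for illustration: the rule that discounts each filter's |gamma| by its strongest correlation with any filter that has a larger |gamma|, the names `pearson`, `filter_importance`, and `select_filters`, and the `prune_ratio` parameter. It is not the authors' method.

```python
import math


def pearson(a, b):
    """Pearson correlation between two flattened filters (lists of floats)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def filter_importance(filters, gammas):
    """Assumed scoring rule: |gamma| * (1 - redundancy).

    Redundancy of a filter is its largest absolute correlation with any
    filter whose |gamma| is larger, so the stronger filter of a
    near-duplicate pair keeps its full score.
    """
    order = sorted(range(len(filters)), key=lambda i: -abs(gammas[i]))
    scores = [0.0] * len(filters)
    for rank, i in enumerate(order):
        redundancy = max(
            (abs(pearson(filters[i], filters[j])) for j in order[:rank]),
            default=0.0,
        )
        scores[i] = abs(gammas[i]) * (1.0 - redundancy)
    return scores


def select_filters(filters, gammas, prune_ratio=0.5):
    """Return the sorted indices of the filters that survive pruning."""
    scores = filter_importance(filters, gammas)
    n_keep = max(1, round(len(filters) * (1.0 - prune_ratio)))
    ranked = sorted(range(len(filters)), key=lambda i: -scores[i])
    return sorted(ranked[:n_keep])


# Toy layer: filter 1 is a scaled copy of filter 0, filter 3 has a tiny gamma.
toy_filters = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, -1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
]
toy_gammas = [1.0, 0.9, 0.5, 0.05]
print(select_filters(toy_filters, toy_gammas, prune_ratio=0.5))
```

In a real network the kept indices would be used to slice the convolution weights and the matching batch normalization parameters, followed by retraining as the abstract describes.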

Figure 1. Example of a wagon number image

Figure 2. Pipeline of the wagon number recognition

Figure 3. Heatmap of the wagon number distribution

Figure 4. Images for the wagon number region detection

Figure 5. Wagon number region detection results: (a) region detection results; (b) extracted region image

Figure 6. Examples of wagon number character detection results
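
The abstract states that detected characters are assembled into the wagon number according to their spatial positions, but it does not describe the search procedure. The Python sketch below shows one simple way this could be done, under assumptions of my own: each detection is a hypothetical tuple `(label, x_center, y_center, height)`, characters whose vertical centers differ by less than `row_tol` times the character height are treated as one row, and rows are read top to bottom, left to right.

```python
def order_characters(dets, row_tol=0.5):
    """Read detections row by row (top to bottom), left to right in each row.

    dets: list of (label, x_center, y_center, height) tuples; this layout
    is an assumption for illustration, not the paper's data format.
    """
    rows = []
    for det in sorted(dets, key=lambda c: c[2]):  # sort by vertical center
        # Same row if the vertical gap to the previous character is small.
        if rows and abs(det[2] - rows[-1][-1][2]) <= row_tol * det[3]:
            rows[-1].append(det)
        else:
            rows.append([det])
    return "".join(
        c[0] for row in rows for c in sorted(row, key=lambda c: c[1])
    )


# Four characters on two rows, given in arbitrary order.
example = [("8", 5, 40, 20), ("1", 20, 11, 20), ("2", 5, 10, 20), ("1", 22, 41, 20)]
print(order_characters(example))
```

A production version would also have to handle overlapping boxes and spurious detections before the 12-character string is passed on to verification.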

Figure 7. Top-8 classes and corresponding probabilities for every character, and the position corrected by verification
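
The verification step can be sketched in Python as follows. The source only refers to "international wagon number coding rules"; this sketch assumes the 12-digit UIC scheme, in which the last digit is a Luhn-style check digit computed over the first 11 digits with weights 2, 1, 2, ... from the left. The single-substitution search, the function names, and the candidate format (per position, a list of `(digit, probability)` pairs sorted by descending probability) are likewise assumptions for illustration rather than the authors' procedure.

```python
def luhn_check_digit(first11):
    """Check digit for 11 leading digits, weights 2,1,2,... from the left."""
    total = 0
    for i, d in enumerate(first11):
        p = d * (2 if i % 2 == 0 else 1)
        total += p // 10 + p % 10  # add the digits of the product
    return (10 - total % 10) % 10


def is_valid(digits):
    return len(digits) == 12 and luhn_check_digit(digits[:11]) == digits[11]


def verify_and_correct(candidates):
    """Return a valid 12-digit list, or None to reject the reading.

    If the top-1 reading fails the check, try replacing one position with
    one of its lower-ranked candidates and keep the valid substitution
    that loses the least probability.
    """
    top1 = [c[0][0] for c in candidates]
    if is_valid(top1):
        return top1
    best, best_ratio = None, 0.0
    for pos, cands in enumerate(candidates):
        p_top = cands[0][1]
        for digit, prob in cands[1:]:
            trial = top1[:pos] + [digit] + top1[pos + 1:]
            ratio = prob / p_top
            if ratio > best_ratio and is_valid(trial):
                best, best_ratio = trial, ratio
    return best


# Example: the third character is misread as 3, with 8 as runner-up.
truth = [2, 1, 8, 1, 2, 4, 7, 1, 2, 1, 7, 3]
cands = [[(d, 0.99), ((d + 1) % 10, 0.01)] for d in truth]
cands[2] = [(3, 0.55), (8, 0.40), (9, 0.05)]
print("".join(map(str, verify_and_correct(cands))))
```

Returning `None` when no single substitution passes the check corresponds to rejecting the image, which is one plausible way to obtain a rejection rate like the one reported in Table 2.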

Table 1. Results of wagon number region detection and character detection

Phase	Detection model	Pruning	mAP/%	Model size/MB	Runtime memory/MB	Mean time/ms
Region detection	YOLOv3	N	95.31	241	1625	44.06
Region detection	YOLOv3	Y	95.37	93	1155	31.41
Region detection	Faster R-CNN	-	95.38	323	1121	103.16
Region detection	SSD	-	95.20	192	833	62.01
Character detection	YOLOv3	N	90.39	241	1307	29.22
Character detection	YOLOv3	Y	90.69	84	881	18.43
Character detection	Faster R-CNN	-	90.68	323	1161	110.09
Character detection	SSD	-	90.40	201	851	62.49
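
To make the effect of pruning in Table 1 easier to read, the short Python sketch below computes the relative reduction in model size, runtime memory, and mean time for the two YOLOv3 models. The numbers are copied from the table; the dictionary layout and the helper name `reduction_percent` are only illustrative.

```python
# (before pruning, after pruning) pairs taken from the YOLOv3 rows of Table 1.
TABLE1 = {
    "region detection": {
        "model size/MB": (241, 93),
        "runtime memory/MB": (1625, 1155),
        "mean time/ms": (44.06, 31.41),
    },
    "character detection": {
        "model size/MB": (241, 84),
        "runtime memory/MB": (1307, 881),
        "mean time/ms": (29.22, 18.43),
    },
}


def reduction_percent(before, after):
    """Relative reduction, in percent of the unpruned value."""
    return 100.0 * (before - after) / before


for phase, metrics in TABLE1.items():
    for name, (before, after) in metrics.items():
        print(f"{phase}, {name}: -{reduction_percent(before, after):.1f}%")
```

On these figures, pruning removes roughly three fifths of the model size and about 29% to 37% of the mean detection time, while mAP is unchanged or slightly higher.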

Table 2. Influence of model pruning and verification on the recognition results

Verification & correction	Pruning	Accuracy rate/%	Error rate/%	Rejection rate/%	Mean time/ms
N	N	93.56	4.76	1.68	221
Y	N	96.36	1.96	1.68	224
Y	Y	96.92	2.15	0.93	191
