Localization model of traditional Chinese medicine Zang-fu based on ALBERT and Bi-GRU

ZHANG De-zheng^{1, 2},
FAN Xin-xin^{1, 2},
XIE Yong-hong^{1, 2},
JIANG Yan-zhao^{1, 2}

doi: 10.13374/j.issn2095-9389.2021.01.13.002

1.
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
2.
Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China

Funding: National Key Research and Development Program of China, Cloud Computing and Big Data special project (2017YFB1002304)

Corresponding author: E-mail: xieyh@ustb.edu.cn

CLC number: TP391.1
Publication history
- Received: 2021-01-13
- Published online: 2021-03-02
- Issue date: 2021-09-18

Abstract

The rapid development of artificial intelligence (AI) has injected new vitality into many industries and offers new ideas for the development of traditional Chinese medicine (TCM). Combining AI with TCM provides further technical support for TCM auxiliary diagnosis and treatment. TCM has developed many methods of syndrome differentiation, among which the differentiation of Zang-fu organs is one of the most important. Zang-fu localization, that is, identifying the Zang-fu organs in which a lesion lies, is an important stage of Zang-fu syndrome differentiation. This paper builds a Zang-fu localization model with a neural network: the model takes symptom text as input and outputs the labels of the corresponding diseased Zang-fu organs, supporting Zang-fu syndrome differentiation in TCM-assisted diagnosis and treatment. The localization of Zang-fu organs is formulated as a multi-label text classification problem in natural language processing. Using TCM medical record data, a Zang-fu localization model based on the pretrained model ALBERT (A Lite BERT) and a bidirectional gated recurrent unit (Bi-GRU) is proposed. Comparison and ablation experiments show that the proposed method is more accurate than the multilayer perceptron and decision tree models, and that, compared with the Word2Vec text representation, the ALBERT-based text representation effectively improves the accuracy of the localization model. In terms of model parameters, the ALBERT pretrained model greatly reduces the number of parameters compared with the BERT model and effectively reduces the model size. Finally, the F1-value of the proposed Zang-fu localization model reaches 0.8013 on the test set, providing a degree of support for TCM auxiliary diagnosis and treatment.

Keywords:
- multi-label text classification
- ALBERT
- GRU
- localization of Zang-fu
- traditional Chinese medicine (TCM)
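
The multi-label formulation described in the abstract can be illustrated with a minimal sketch of how per-organ scores might be turned into a set of labels. This is not the authors' implementation: it assumes one independent sigmoid output per label and a 0.5 decision threshold, and the label set `ORGANS` and the function names are hypothetical, since the abstract states neither the organ labels nor the decision rule.

```python
import math

# Hypothetical label set: the abstract does not enumerate the organ labels.
ORGANS = ["heart", "liver", "spleen", "lung", "kidney"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a raw score to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_labels(scores, threshold=0.5):
    """Return every organ whose probability reaches the threshold.

    Each label is decided independently, so zero, one, or several organs
    may be returned. The 0.5 threshold is an assumption, not a value
    taken from the paper.
    """
    if len(scores) != len(ORGANS):
        raise ValueError("expected one score per organ label")
    return [organ for organ, s in zip(ORGANS, scores) if sigmoid(s) >= threshold]


if __name__ == "__main__":
    # Made-up raw scores, for illustration only.
    print(decode_labels([2.1, -1.3, 0.4, -3.0, 1.7]))
```

The point of the sketch is the contrast with single-label classification: there is no softmax forcing exactly one winner, so a symptom description can map to several diseased organs at once.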

Figure 1. Zang-fu localization model structure

Figure 2. ALBERT model structure

Figure 3. GRU unit

Figure 4. Bi-GRU model diagram

Table 1. Data format for Zang-fu localization

No.	Symptoms	Tag
1	Legs ache, and wake up unable to sleep, along with hemoptysis and a sore throat	spleen, kidney, heart
2	The patient had high blood pressure, weakness in the right limb, and pain in the left upper arm	liver, kidney
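
As an illustration of how records in this format might be prepared for a multi-label classifier, the sketch below converts the comma-separated tag field into a multi-hot target vector. The label order in `ORGANS` and the helper name `encode_tags` are assumptions made for this example; the table shows only four organ names, and the paper's full label set is not given in this section.

```python
# Assumed label set and order; only heart, liver, spleen and kidney
# actually appear in the sample rows of the table.
ORGANS = ["heart", "liver", "spleen", "lung", "kidney"]


def encode_tags(tag_field, labels=ORGANS):
    """Turn a tag string such as "liver, kidney" into a 0/1 vector.

    Position i of the result is 1 when labels[i] is listed in the tag
    field and 0 otherwise. Unknown tags raise an error rather than
    being silently dropped.
    """
    tags = {t.strip().lower() for t in tag_field.split(",") if t.strip()}
    unknown = tags - set(labels)
    if unknown:
        raise ValueError(f"unknown organ tag(s): {sorted(unknown)}")
    return [1 if label in tags else 0 for label in labels]


if __name__ == "__main__":
    # Tag fields copied from the two sample rows of the table.
    for tag_field in ("spleen, kidney, heart", "liver, kidney"):
        print(tag_field, "->", encode_tags(tag_field))
```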

Table 2. Parameters used in the training process

Parameter name	Parameter value
Max_seq_lenth	128
GRU_units	128
Dropout	0.4
Learning_rate	1×10^{-4}
Epochs	10
Batch_size	128

Table 3. Comparative experimental results for multi-label classification

No.	Method	Precision	Recall	F1-value
1	Word2Vec+Bi-GRU	0.8015	0.7653	0.7830
2	MLP Classifier	0.7091	0.7067	0.7079
3	Decision Tree Classifier	0.6744	0.6633	0.6688
4	ALBERT+Bi-GRU	0.8301	0.7745	0.8013
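
The F1-values in Table 3 are consistent with the harmonic mean of the listed precision and recall, F1 = 2PR / (P + R). The short check below recomputes them from the table's own numbers. The function name `f1_score` is chosen here for illustration, and how the paper averages precision and recall across labels is not stated in this section.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-value) copied from Table 3.
TABLE_3 = {
    "Word2Vec+Bi-GRU": (0.8015, 0.7653, 0.7830),
    "MLP Classifier": (0.7091, 0.7067, 0.7079),
    "Decision Tree Classifier": (0.6744, 0.6633, 0.6688),
    "ALBERT+Bi-GRU": (0.8301, 0.7745, 0.8013),
}

if __name__ == "__main__":
    for method, (p, r, reported) in TABLE_3.items():
        computed = f1_score(p, r)
        print(f"{method}: computed {computed:.4f}, reported {reported:.4f}")
```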

Table 4. Comparative experimental results of BERT and ALBERT

Id	Method	Precision	Recall	F1-value	Time/s	Model_parameters/MB
1	BERT+Bi-GRU	0.8253	0.7783	0.8011	99.8219	363.3
2	ALBERT+Bi-GRU	0.8301	0.7745	0.8013	84.7045	37.3
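
The size and time figures in Table 4 can be put in relative terms with a few lines of arithmetic. The sketch below uses only the table's numbers; the helper name `relative_reduction` is hypothetical, and because the table does not say what the Time/s column measures, it is treated simply as a reported duration.

```python
def relative_reduction(baseline, value):
    """Fraction by which `value` is smaller than `baseline`."""
    return (baseline - value) / baseline


# Values copied from Table 4.
bert_size_mb, albert_size_mb = 363.3, 37.3
bert_time_s, albert_time_s = 99.8219, 84.7045

size_ratio = bert_size_mb / albert_size_mb
size_cut = relative_reduction(bert_size_mb, albert_size_mb)
time_cut = relative_reduction(bert_time_s, albert_time_s)

if __name__ == "__main__":
    print(f"BERT/ALBERT size ratio: {size_ratio:.2f}x")
    print(f"size reduction: {size_cut:.1%}")
    print(f"time reduction: {time_cut:.1%}")
```

By these numbers the ALBERT+Bi-GRU model is roughly a tenth of the size of BERT+Bi-GRU, while the F1-values in the table are nearly identical (0.8013 versus 0.8011).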

Table 5. Ablation experiment results for multi-label classification

Method	Precision	Recall	F1-value
ALBERT	0.7711	0.7315	0.7508
ALBERT+Bi-GRU	0.8301	0.7745	0.8013
