Submitted: 30 May 2024
Posted: 30 May 2024
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Materials
2.1.1. Corpus Collection and Pre-Processing
2.1.2. Corpus Annotation
2.1.3. Construction of Lexicon and Pre-Training Word Embedding
2.2. Methods
2.2.1. Framework
2.2.2. Char-Words Pair Sequence
2.2.3. Lexicon Adapter
2.2.4. Lexicon Enhanced BERT
2.2.5. Lexicon Enhanced Contrastive Learning
3. Experiment
3.1. Evaluation
3.2. Experimental Settings
3.3. Results
4. Discussion
4.1. Performance Analysis of the Proposed Model
4.2. Comparison of Common Pre-Trained and Lexicon-Based Models
- (1) Effectiveness of the lexicon-enhanced BERT
- (2) Effectiveness of the contrastive learning
4.3. Analysis of Few-Shot Results
4.4. Experiments on Public Datasets
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Data Source Details
| No. | Type | Examples |
|---|---|---|
| 1 | Professional books | Jeffrey J. Zimmerman, Locke A. Karriker, Alejandro Ramirez, et al., editors-in-chief; Hanchun Yang, chief translator. Diseases of Swine. North United Publishing Media Co., Ltd., Liaoning Science and Technology Publishing House: Beijing, China, 2022. Yousheng Xu. Primary Color Atlas of Scientific Pig Raising and Pig Disease Prevention and Control. China Agricultural Publishing House: Beijing, China, 2017. Changyou Li, Xiaocheng Li. Prevention and Control Technology of Swine Epidemic Disease. China Agricultural Publishing House: Beijing, China, 2015. Jianxin Zhang. Diagnosis and Control of Herd Pig Epidemic Disease. Henan Science and Technology Press: Zhengzhou, China, 2014. Chaoying Luo, Guibo Wang. Prevention and Treatment of Pig Diseases and Safe Medication. Chemical Industry Press: Beijing, China, 2016, etc. (Chinese titles: 《猪病学》, 《科学养猪与猪病防制原色图谱》, 《猪群疫病防治技术》, 《群养猪疫病诊断与控制》, 《猪病防治及安全用药》) |
| 2 | Standard specifications | Technical Specification for Quarantine of Porcine Reproductive and Respiratory Syndrome (SN/T 1247-2022), Diagnostic Techniques for Mycoplasma Pneumonia in Swine (NY/T 1186-2017), Diagnostic Techniques for Infectious Pleuropneumonia in Swine (NY/T 537-2023), Diagnostic Techniques for Swine Dysentery (NY/T 545-2023), Technical Specification for Quarantine of Porcine Rotavirus Infection (SN/T 5196-2020), etc. (Chinese titles: 《猪繁殖与呼吸综合征检疫技术规范》, 《猪支原体肺炎诊断技术》, 《猪传染性胸膜肺炎诊断技术》, 《猪痢疾诊断技术》, 《猪轮状病毒感染检疫技术规范》) |
| 3 | Technical specifications | Technical Specification for Prevention and Control of Highly Pathogenic Porcine Blue Ear Disease, Technical Specification for Prevention and Control of Foot-and-Mouth Disease, Technical Specification for Prevention and Control of Classical Swine Fever, etc. (Chinese titles: 《高致病性猪蓝耳病防治技术规范》, 《口蹄疫防治技术规范》, 《猪瘟防治技术规范》) |
| 4 | Policy papers | Ministry of Agriculture and Rural Affairs, "List of Class I, II and III Animal Diseases"; Ministry of Agriculture, notice issuing the "National Guiding Opinions on Prevention and Control of Highly Pathogenic Porcine Blue Ear Disease (2017-2020)" and the "National Guiding Opinions on Prevention and Control of Classical Swine Fever (2017-2020)", etc. (Chinese titles: 《一、二、三类动物疫病病种名录》, 《国家高致病性猪蓝耳病防治指导意见(2017-2020年)》, 《国家猪瘟防治指导意见(2017-2020年)》) |
| 5 | Relevant industry websites | China Veterinary Website (https://www.cadc.net.cn/sites/MainSite/), Big Animal Husbandry Website (https://www.dxumu.com/), Huinong Website (https://www.cnhnb.com/), etc. (Chinese names: 中国兽医网, 大畜牧网, 惠农网) |




| Category | Category definition | Examples | Number | Proportion of the total |
|---|---|---|---|---|
| Type | Names of different types of pig | 妊娠母猪、仔猪 (pregnant sows, piglets) | 635 | 8.45% |
| Disease | Names of pig diseases | 猪丹毒、胸膜炎 (porcine erysipelas, pleurisy) | 808 | 10.75% |
| Body parts | Body positions, organs and systems of pigs | 心脏、巨噬细胞 (heart, macrophages) | 1865 | 24.81% |
| Symptom | External manifestations caused by diseases | 气喘、咳嗽、水肿 (asthma, cough, edema) | 2973 | 39.55% |
| Medicine | Medications for treating diseases | 替米考星、克林霉素 (tilmicosin, clindamycin) | 689 | 9.16% |
| Control | Measures for preventing and treating diseases | 隔离、消毒 (isolation, disinfection) | 548 | 7.29% |
| Total | | | 7518 | 100% |
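
As a quick consistency check on the table above, the following Python sketch recomputes each category's share from the entity counts. The counts are copied from the table; the variable and function names are illustrative and do not come from the paper.

```python
# Entity counts per category, copied from the annotation statistics table.
ENTITY_COUNTS = {
    "Type": 635,
    "Disease": 808,
    "Body parts": 1865,
    "Symptom": 2973,
    "Medicine": 689,
    "Control": 548,
}


def category_proportions(counts):
    """Return each category's share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


total_entities = sum(ENTITY_COUNTS.values())
proportions = category_proportions(ENTITY_COUNTS)

if __name__ == "__main__":
    print(total_entities)
    for name, share in proportions.items():
        print(f"{name}: {share}%")
```

The recomputed total and shares agree with the table. Symptom and Body parts together account for about 64% of all annotated entities, while Control is the smallest category.
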

| Model category | Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|---|
| Baseline model without pre-training | BiLSTM-CRF | 75.17 | 72.29 | 73.70 |
| Pre-trained model | BERT-BiLSTM-CRF | 80.94 | 85.04 | 82.94 |
| | BERT-CRF | 84.02 | 82.70 | 83.36 |
| | BERT-CNN-CRF | 80.98 | 85.14 | 83.01 |
| | BERT-WWM-ext | 80.81 | 83.83 | 82.29 |
| | RoBERTa | 82.28 | 84.18 | 83.22 |
| Pre-trained model with lexicon | BERT-BiLSTM-CRF-SoftLexicon | 82.49 | 84.36 | 83.41 |
| | LEBERT | 86.47 | 84.64 | 85.54 |
| | PDCNER (ours) | 86.92 | 85.08 | 85.99 |
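
The F1 column can be sanity-checked against the P and R columns. The sketch below assumes F1 is the usual harmonic mean of precision and recall, F1 = 2PR / (P + R); the text of the Evaluation section is not available here, so this is an assumption, and the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) for three rows copied from the results table.
REPORTED = {
    "BiLSTM-CRF": (75.17, 72.29, 73.70),
    "LEBERT": (86.47, 84.64, 85.54),
    "PDCNER": (86.92, 85.08, 85.99),
}

# Absolute difference between recomputed and reported F1 for each model.
deviations = {
    model: abs(f1_score(p, r) - reported)
    for model, (p, r, reported) in REPORTED.items()
}
```

Under that assumption, the recomputed values differ from the reported ones by less than 0.02 points, which is consistent with P and R having been rounded before they were tabulated.
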

| Metric | Type | Disease | Body parts | Symptom | Medicine | Control |
|---|---|---|---|---|---|---|
| Precision | 95.94 | 91.67 | 90.28 | 81.42 | 88.66 | 71.29 |
| Recall | 94.89 | 94.29 | 87.82 | 80.43 | 91.49 | 56.69 |
| F1 | 95.41 | 92.96 | 89.03 | 80.92 | 90.05 | 63.16 |

| Model | 1% P | 1% R | 1% F1 | 10% P | 10% R | 10% F1 | 30% P | 30% R | 30% F1 |
|---|---|---|---|---|---|---|---|---|---|
| BERT-BiLSTM-CRF | 31.79 | 1.80 | 3.41 | 65.75 | 76.30 | 70.63 | 76.95 | 81.18 | 79.01 |
| LEBERT | 17.39 | 11.43 | 13.79 | 74.81 | 83.47 | 78.91 | 74.05 | 80.14 | 76.97 |
| PDCNER (ours) | 18.18 | 11.43 | 14.04 | 84.43 | 85.12 | 84.77 | 86.08 | 84.70 | 85.39 |
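
To make the few-shot comparison easier to read, the sketch below computes the absolute F1 gain of PDCNER over each baseline per column. The F1 values are copied from the table; the keys "1%", "10%" and "30%" simply mirror the column labels (presumably the share of training data used, which is an assumption here), and the helper name is illustrative.

```python
# F1 (%) per column, copied from the few-shot table.
FEW_SHOT_F1 = {
    "BERT-BiLSTM-CRF": {"1%": 3.41, "10%": 70.63, "30%": 79.01},
    "LEBERT": {"1%": 13.79, "10%": 78.91, "30%": 76.97},
    "PDCNER": {"1%": 14.04, "10%": 84.77, "30%": 85.39},
}


def f1_gain(ours, baseline):
    """Absolute F1 difference in percentage points, per column."""
    return {k: round(ours[k] - baseline[k], 2) for k in ours}


gain_over_lebert = f1_gain(FEW_SHOT_F1["PDCNER"], FEW_SHOT_F1["LEBERT"])
gain_over_bert = f1_gain(FEW_SHOT_F1["PDCNER"], FEW_SHOT_F1["BERT-BiLSTM-CRF"])

if __name__ == "__main__":
    print("vs LEBERT:", gain_over_lebert)
    print("vs BERT-BiLSTM-CRF:", gain_over_bert)
```

On these numbers the margin over LEBERT is small in the 1% column and grows in the 10% and 30% columns, while the margin over BERT-BiLSTM-CRF is positive in all three.
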

| Model | | OntoNotes | Resume |
|---|---|---|---|
| BERT-BiLSTM-CRF | 69.13 | 82.11 | 95.89 |
| LEBERT | 74.91 | 86.07 | 96.68 |
| PDCNER (ours) | 76.38 | 86.44 | 96.71 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
