Submitted: 07 June 2024
Posted: 11 June 2024
Abstract
Keywords:
1. Introduction
2. Related Work
3. Model Design
3.1. BERT
3.2. BiLSTM
3.3. CNN
3.4. BERT-BiLSTM-CNN
4. Experiments and Results
4.1. Data Set
4.2. Experimental Environment and Parameters Design
4.3. Experimental Comparison and Result Analysis
4.3.1. Model Comparison
4.3.2. Comparison of the Classification Effects of Different Parameters
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References





| Label | Text category | Training set | Test set |
|---|---|---|---|
| 0 | Consult (咨询) | 14260 | 3843 |
| 1 | Complain (投诉) | 16246 | 4319 |
| 2 | Seek Assist (求助) | 9669 | 2549 |
| 3 | Suggestion (建议) | 1207 | 325 |
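The class counts in the table above are far from balanced: the Suggestion category is much smaller than the other three. The following minimal sketch computes each category's share of the training set and the fraction of each category held out for testing. The inverse-frequency `weight` column is a hypothetical illustration of one common way to compensate for imbalance; the source does not report using class weights.

```python
# Class counts copied from the data set table: label -> (name, train, test)
COUNTS = {
    0: ("Consult", 14260, 3843),
    1: ("Complain", 16246, 4319),
    2: ("Seek Assist", 9669, 2549),
    3: ("Suggestion", 1207, 325),
}


def summarize(counts):
    """Return (train_total, test_total, per-class rows) for a count table."""
    train_total = sum(train for _, train, _ in counts.values())
    test_total = sum(test for _, _, test in counts.values())
    rows = []
    for label, (name, train, test) in counts.items():
        rows.append({
            "label": label,
            "name": name,
            # Share of the training set taken by this category
            "train_share": train / train_total,
            # Fraction of this category's samples that are in the test set
            "test_fraction": test / (train + test),
            # Hypothetical inverse-frequency weight (not from the source)
            "weight": train_total / (len(counts) * train),
        })
    return train_total, test_total, rows


train_total, test_total, rows = summarize(COUNTS)
print(f"train={train_total} test={test_total}")
for row in rows:
    print(f"{row['label']} {row['name']:<12} "
          f"share={row['train_share']:.3f} "
          f"test_fraction={row['test_fraction']:.3f} "
          f"weight={row['weight']:.2f}")
```

Under these counts the split holds out roughly one fifth of each category, so the test set keeps approximately the same class proportions as the training set.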

| Label | Text content |
|---|---|
| 0 | 市民来电咨询:广东省河源市回余人员有什么疫情防控措施? A citizen called to ask: What epidemic prevention and control measures apply to people returning to Xinyu from Heyuan City, Guangdong Province? |
| 1 | 新余四中不经过家长同意补课收取补课费 不允许家长反抗 说一句不同意都不行 把孩子教好是教师应尽的责任与义务 要有高效课堂 减轻孩子负担 而不是靠收费和补课来加重孩子和家长的压力。 Xinyu No. 4 Middle School holds remedial classes and charges fees for them without parents' consent. Parents are not allowed to object; even a single word of disagreement is not accepted. Teaching children well is a teacher's responsibility and obligation. There should be efficient classroom teaching that lightens children's burden, rather than fees and remedial classes that increase the pressure on children and parents. |
| 2 | 市民来电:高速爆胎的求助。 A citizen called: request for help with a tire blowout on the expressway. |
| 0 | 近日省教育厅发文称全省2021届初三学生中考改革,但我去省教育厅却说以当地中招政策为主。请问这届初三学生中考是否改革? The provincial Department of Education recently issued a document saying that the senior high school entrance examination will be reformed for the province's 2021 cohort of ninth-grade students, but when I went to the provincial Department of Education I was told that the local admissions policy takes precedence. Will the entrance examination be reformed for this cohort of ninth-grade students? |
| 3 | 吴先生来电建议减少南源路货车通行量。 Mr. Wu called to suggest reducing truck traffic on Nanyuan Road. |
| 1 | 市民来电反映厦门某宝马4S店退回定金且不售卖宝马车,认为不合理! A citizen called to report that a BMW 4S dealership in Xiamen refunded the deposit and refused to sell the BMW, which the caller considers unreasonable! |
| 2 | 市民来电:虎山路长青小学(市民表示不愿意透露年级)下完延时课下午5点以后是否可以由学校老师(带去其他教学地方进行教学(教学内容是课堂学习内容),市民表示是普遍现象。 A citizen called: At Changqing Primary School on Hushan Road (the caller did not wish to disclose the grade), after the extended-hours class ends at 5 p.m., may school teachers take students to other teaching venues for lessons (the lessons cover classroom material)? The caller says this is a common occurrence. |

| Hyperparameter | Value | Hyperparameter | Value |
|---|---|---|---|
| Epoch | 15 | Optimizer | Adam |
| Dropout | 0.5 | Hidden_size | 768 |
| Learning rate | 10⁻³ | Max_length | 300 |
| Batch_size | 128 | Kernel_sizes | [7,8,9] |
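The settings in the table above can be gathered into a single configuration object, which also makes a few derived quantities easy to check. The sketch below is illustrative only: the `TrainConfig` class and the helper functions are hypothetical names, and the feature-map lengths assume one-dimensional convolutions with stride 1, no padding, and no dilation, none of which is stated in the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the parameter table (names are illustrative)."""
    epochs: int = 15
    dropout: float = 0.5
    learning_rate: float = 1e-3
    batch_size: int = 128
    optimizer: str = "Adam"
    hidden_size: int = 768
    max_length: int = 300
    kernel_sizes: tuple = (7, 8, 9)


def conv_output_lengths(cfg):
    """Length of each convolution branch's feature map.

    Assumption (not from the source): stride 1, no padding, no dilation,
    so a kernel of width k over a sequence of length L yields L - k + 1 positions.
    """
    return [cfg.max_length - k + 1 for k in cfg.kernel_sizes]


def steps_per_epoch(num_examples, cfg):
    """Number of mini-batches per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / cfg.batch_size)


cfg = TrainConfig()
print(conv_output_lengths(cfg))      # feature-map length per kernel size
print(steps_per_epoch(41382, cfg))   # 41382 = training-set size from the data set table
```

With a maximum length of 300 tokens, even the widest kernel in the table still slides over nearly the whole sequence, so max-pooling each branch reduces a few hundred positions to one value per filter.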

| Model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| BERT-CNN | 0.9509 | 0.9354 | 0.9503 | 0.9395 |
| BERT-BiGRU | 0.9345 | 0.9405 | 0.9126 | 0.9213 |
| BERT-BiLSTM | 0.9306 | 0.9080 | 0.9270 | 0.9129 |
| BERT-BiGRU-CNN | 0.9420 | 0.9385 | 0.9344 | 0.9322 |
| BERT-BiLSTM-CNN | 0.9536 | 0.9541 | 0.9519 | 0.9509 |
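The F1-scores in the table above are not the harmonic mean of the precision and recall in the same row. For BERT-CNN, for example, the harmonic mean of 0.9354 and 0.9503 is about 0.9428, not 0.9395. This is what one would expect if each metric is computed per class and then averaged (macro averaging), but the averaging scheme is an assumption here, since the table does not state it. The sketch below shows how macro-averaged metrics follow from a confusion matrix; the function name and the example matrix are hypothetical.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(confusion[c]) - tp                       # true c, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical two-class confusion matrix, for illustration only
example = [[8, 2],
           [1, 9]]
print(macro_metrics(example))
```

Because macro F1 averages the per-class F1 values, it generally differs from the harmonic mean of macro precision and macro recall, and a small class such as Suggestion counts as much as a large one.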

| Model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| BERT-BiGRU | 0.9345 | 0.9405 | 0.9136 | 0.9223 |
| BERT-BiGRU-Attention | 0.9469 | 0.9386 | 0.9448 | 0.9387 |
| BERT-BiLSTM | 0.9306 | 0.9080 | 0.9270 | 0.9129 |
| BERT-BiLSTM-Attention | 0.9391 | 0.9382 | 0.9309 | 0.9314 |

| Model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| BERT-BiGRU-CNN | 0.9420 | 0.9385 | 0.9344 | 0.9322 |
| BERT-BiGRU-Attention-CNN | 0.9299 | 0.9274 | 0.9062 | 0.9125 |
| BERT-CNN | 0.9509 | 0.9354 | 0.9503 | 0.9395 |
| BERT-Attention-CNN | 0.9446 | 0.9440 | 0.9287 | 0.9335 |
| BERT-BiLSTM-CNN | 0.9536 | 0.9541 | 0.9519 | 0.9509 |
| BERT-BiLSTM-CNN-Attention | 0.9505 | 0.9528 | 0.9457 | 0.9470 |

Model: BERT-BiLSTM-CNN

| Num_layer | Kernel_sizes | Hidden_sizes | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| 1 | [2,3,4] | 768 | 0.9461 | 0.9450 | 0.9454 | 0.9431 |
| 1 | [3,4,5] | 768 | 0.9491 | 0.9510 | 0.9414 | 0.9442 |
| 1 | [4,5,6] | 768 | 0.9467 | 0.9360 | 0.9461 | 0.9378 |
| 1 | [5,6,7] | 768 | 0.9377 | 0.9377 | 0.9485 | 0.9392 |
| 1 | [6,7,8] | 768 | 0.9529 | 0.9521 | 0.9495 | 0.9490 |
| 1 | [7,8,9] | 768 | 0.9536 | 0.9541 | 0.9519 | 0.9509 |
| 1 | [8,9,10] | 768 | 0.9464 | 0.9459 | 0.9474 | 0.9437 |
| 1 | [9,10,11] | 768 | 0.9527 | 0.9532 | 0.9495 | 0.9491 |
| 1 | [10,11,12] | 768 | 0.9519 | 0.9509 | 0.9563 | 0.9487 |
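Which kernel setting counts as best in the table above depends on the metric used to rank the rows. The short sketch below copies the reported values into a dictionary and picks the top configuration per metric; the `KERNEL_SWEEP` and `best_by` names are hypothetical, and the code only re-reads the reported numbers rather than reproducing any training run.

```python
# Rows copied from the kernel-size table:
# kernel_sizes -> (accuracy, precision, recall, f1)
KERNEL_SWEEP = {
    (2, 3, 4): (0.9461, 0.9450, 0.9454, 0.9431),
    (3, 4, 5): (0.9491, 0.9510, 0.9414, 0.9442),
    (4, 5, 6): (0.9467, 0.9360, 0.9461, 0.9378),
    (5, 6, 7): (0.9377, 0.9377, 0.9485, 0.9392),
    (6, 7, 8): (0.9529, 0.9521, 0.9495, 0.9490),
    (7, 8, 9): (0.9536, 0.9541, 0.9519, 0.9509),
    (8, 9, 10): (0.9464, 0.9459, 0.9474, 0.9437),
    (9, 10, 11): (0.9527, 0.9532, 0.9495, 0.9491),
    (10, 11, 12): (0.9519, 0.9509, 0.9563, 0.9487),
}

METRICS = ("accuracy", "precision", "recall", "f1")


def best_by(metric, sweep=KERNEL_SWEEP):
    """Return the kernel-size tuple with the highest value of the given metric."""
    idx = METRICS.index(metric)
    return max(sweep, key=lambda kernels: sweep[kernels][idx])


for metric in METRICS:
    print(metric, best_by(metric))
```

On the reported numbers, [7,8,9] ranks first on accuracy, precision, and F1-score, while [10,11,12] has the highest recall, so the choice of [7,8,9] is consistent with ranking by accuracy or F1-score.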

Model: BERT-BiLSTM-CNN

| Num_layer | Kernel_sizes | Hidden_sizes | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| 1 | [7,8,9] | 128 | 0.9379 | 0.9440 | 0.9337 | 0.9353 |
| 1 | [7,8,9] | 256 | 0.9483 | 0.9530 | 0.9434 | 0.9459 |
| 1 | [7,8,9] | 512 | 0.9521 | 0.9528 | 0.9469 | 0.9480 |
| 1 | [7,8,9] | 768 | 0.9536 | 0.9541 | 0.9519 | 0.9509 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).