Weibo Comments Sentiment Classification Based on BERT and Text CNN

XU Kaixuan^a, LI Xian^b, PAN Yalei^a
a. Institute of Complexity Science; b. Institute for Future, Qingdao University, Qingdao 266071, China
|
Abstract: For comments containing multiple clauses within a sentence, state-of-the-art models such as the Embedding from Language Models-Text Convolutional Neural Network (ELMo-Text CNN) and the Generative Pre-trained Transformer (GPT) cannot accurately extract the intended meaning and therefore perform unsatisfactorily. To solve this problem, we employ a Bidirectional Encoder Representations from Transformers-Text Convolutional Neural Network (BERT-Text CNN) model. Using the bidirectional Transformer encoder structure built on BERT's self-attention mechanism, we obtain word vectors that capture the global features of a sentence; these word vectors are then fed into Text CNN, which captures local features and extracts high-level features such as semantics and contextual connections. This process solves the problem of inaccurate contextual representation of the text and enables fine-grained sentiment classification of Weibo comments with high accuracy. To verify the advantages of the model, we compare it with existing models. Test results on the simplifyweibo_4_moods dataset show that the BERT-Text CNN model improves accuracy, recall, and F1 score.
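The Text CNN stage described above can be sketched in a few lines: convolutions of several widths slide over the token axis of the contextual word vectors, and max-over-time pooling condenses each filter's responses into one high-level feature. The following is a minimal NumPy sketch with toy dimensions; the random 10×16 matrix stands in for BERT's contextual embeddings (in practice 768-dimensional), and the filter widths and counts are illustrative assumptions, not the paper's hyperparameters.

```python
import numpy as np

np.random.seed(0)

# Toy stand-in for BERT output: one sentence of 10 tokens,
# each a 16-dimensional contextual word vector (hypothetical sizes).
seq_len, emb_dim = 10, 16
word_vectors = np.random.randn(seq_len, emb_dim)

def text_cnn_features(x, filter_sizes=(2, 3, 4), n_filters=4):
    """Text CNN feature extractor: 1-D convolutions of several widths
    over the token axis, ReLU, then max-over-time pooling, concatenated."""
    pooled = []
    for fs in filter_sizes:
        # Random filters stand in for learned convolution weights.
        W = np.random.randn(n_filters, fs, x.shape[1]) * 0.1
        # Slide each filter over all token windows of width fs.
        conv = np.array([
            [np.sum(W[k] * x[i:i + fs]) for i in range(x.shape[0] - fs + 1)]
            for k in range(n_filters)
        ])
        conv = np.maximum(conv, 0.0)      # ReLU
        pooled.append(conv.max(axis=1))   # max-over-time pooling
    return np.concatenate(pooled)         # high-level feature vector

features = text_cnn_features(word_vectors)
print(features.shape)  # (12,) = 3 filter sizes x 4 filters each
```

In the full model this pooled feature vector would feed a softmax layer that outputs one of the four sentiment classes; the multiple filter widths let the network match local n-gram patterns of different lengths within a clause.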
Received: 02 November 2020
Published: 10 May 2021