Abstract: To improve the accuracy of gender recognition for social network users, the text features and emoticon features of a single user are first fused to identify that user's gender, and the interaction features of multiple users are then extracted to improve accuracy further. The experimental results show that fusing multi-user interaction features improves the accuracy of user gender recognition by 6.8%. This indicates that both emoticons and multi-user interaction features contribute substantially to identifying the gender of social network users.
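The single-user fusion step described in the abstract can be illustrated with a minimal sketch. It assumes feature-level fusion by simple concatenation of a bag-of-words text vector and an emoticon count vector, followed by a nearest-centroid classifier with cosine similarity. None of these choices are taken from the paper: the vocabularies, the toy posts, the placeholder labels `class_a` and `class_b`, and every function name are hypothetical, and the multi-user interaction features are not modelled here.

```python
import math
from collections import Counter

# Hypothetical vocabularies, chosen only for this illustration.
TEXT_VOCAB = ["game", "movie"]
EMOJI_VOCAB = [":)", ":D"]


def text_features(post, vocab):
    """Bag-of-words counts over a fixed word vocabulary."""
    counts = Counter(post.lower().split())
    return [float(counts[word]) for word in vocab]


def emoji_features(post, emoji_vocab):
    """Occurrence counts of each emoticon in the post."""
    return [float(post.count(emoji)) for emoji in emoji_vocab]


def fuse(post, vocab, emoji_vocab):
    """Feature-level fusion (assumed): concatenate both feature vectors."""
    return text_features(post, vocab) + emoji_features(post, emoji_vocab)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples, vocab, emoji_vocab):
    """Compute one centroid of fused vectors per label."""
    by_label = {}
    for post, label in samples:
        by_label.setdefault(label, []).append(fuse(post, vocab, emoji_vocab))
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(post, centroids, vocab, emoji_vocab):
    """Return the label whose centroid is most similar to the fused vector."""
    vector = fuse(post, vocab, emoji_vocab)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Toy training data with placeholder labels; not real user data.
SAMPLES = [
    ("game tonight :)", "class_a"),
    ("movie tonight :)", "class_a"),
    ("game tonight :D", "class_b"),
    ("movie tonight :D", "class_b"),
]

centroids = train(SAMPLES, TEXT_VOCAB, EMOJI_VOCAB)
prediction = predict("game :D", centroids, TEXT_VOCAB, EMOJI_VOCAB)
```

In this toy data the two classes use identical words, so the text features alone cannot separate them; only the concatenated emoticon counts do. This is meant to show the general motivation for fusing the two feature types, not to reproduce the reported results.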
WANG Hao, XU Xiaoke. Social network user gender recognition by combining text and emoji features[J]. Complex Systems and Complexity Science, 2022, 19(4): 17-24.