

An empirical comparison of machine learning methods for text-based sentiment analysis of online consumer reviews
Institution: 1. Paul Merage School of Business, University of California, Irvine, United States; 2. UCL School of Management, University College London, London, United Kingdom; 3. Donald Bren School of Information and Computer Sciences, University of California, Irvine, United States; 1. Rennes School of Business, 2 Rue Robert D'Arbrissel – CS 76522, 35065 Rennes, France; 2. Grenoble Ecole de Management, 12 Rue Pierre Semard, 38000 Grenoble, France; 1. University of Amsterdam Business School, Plantage Muidergracht 12, 1018 TV Amsterdam, the Netherlands; 2. Tilburg University, Warandelaan 2, 5037 AB Tilburg, the Netherlands; 3. UNC-Chapel Hill, Chapel Hill, NC, USA; 1. Rotman School of Management at the University of Toronto, 105 St. George Street, Toronto, Ontario, Canada; 2. China Europe International Business School, 699 Hongfeng Road, Pudong, Shanghai 201206, China; 1. Essec Business School, 3 Av. Bernard Hirsch, 95000 Cergy, France; 2. Ivey Business School, Western University, 1255 Western Rd, London, ON N6G 0N1, Canada; 3. Carson College of Business, Washington State University, 300 NE College Ave, Pullman, WA 99163, United States; 1. The Interdisciplinary Center (IDC), Israel; 2. Hebrew University of Jerusalem, Israel; 3. New York University, USA and The Interdisciplinary Center (IDC), Israel
Abstract: The amount of digital text-based consumer review data has increased dramatically, and many machine learning approaches exist for automated text-based sentiment analysis. Marketing researchers have employed various methods for analyzing text reviews but lack a comprehensive comparison of their performance to guide method selection in future applications. We focus on the fundamental relationship between a consumer’s overall empirical evaluation and the text-based explanation of that evaluation. We study the empirical tradeoff between predictive and diagnostic abilities in applying various methods to estimate this fundamental relationship. We incorporate methods previously employed in the marketing literature as well as methods that are so far less common in it. For generalizability, we analyze 25,241 products in nine product categories and 260,489 reviews across five review platforms. We find that neural network-based machine learning methods, in particular pre-trained versions, offer the most accurate predictions, while topic models such as Latent Dirichlet Allocation offer deeper diagnostics. However, neural network models are not suited for diagnostic purposes, and topic models are ill-equipped for making predictions. Consequently, future selection of methods to process text reviews is likely to be based on analysts’ goals of prediction versus diagnostics.
Keywords: Automated text analysis; Sentiment analysis; Online reviews; User-generated content; Machine learning; Natural language processing
This article is indexed in ScienceDirect and other databases.