User-Generated Content Extraction: A Bibliometric Analysis of the Research Literature (2007–2022)

Ni Made Satvika Iswari, Nunik Afriliana, Suryasari


Scientific studies on user-generated content extraction began in 2007. User-generated content (UGC), meaning any form of content created by users, is widely available on social media and can influence customers' desire to shop. This study aims to systematically map research trends in the field of UGC extraction over the last 15 years using metadata taken from the Scopus database. The resulting map is intended to reveal novelties and opportunities that researchers can draw on when choosing a research theme. A bibliometric review was carried out on literature published from 2007 to 2022. A search using keywords related to UGC extraction returned 382 papers. The main findings of this study are: 1) research in the field of UGC extraction emerged in 2007 and has grown since; 2) research in this field has been conducted by researchers from various countries, mostly from China, followed by the United States, India, Italy, Germany, Spain, and others; 3) the keywords most discussed in this field include UGC, sentiment analysis, opinion mining, social media, and information extraction. This bibliometric analysis provides information on future research opportunities and directions related to UGC extraction. The originality of this study lies in applying bibliometric analysis to research trends in UGC with a focus on extraction techniques. The topic merits attention because mining and extracting knowledge from UGC is expensive and labor-intensive.
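As an illustration of the keyword co-occurrence idea mentioned in the abstract, the following minimal Python sketch counts how often two keywords appear together in the same paper. It is not the authors' workflow: the sample records, the semicolon separator, and the function name `keyword_cooccurrence` are assumptions made for this example, and a real analysis would read the keyword field from the exported database metadata.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each string mimics the author-keyword field of one paper.
records = [
    "user-generated content; sentiment analysis; social media",
    "opinion mining; sentiment analysis; user-generated content",
    "information extraction; social media",
]


def keyword_cooccurrence(keyword_fields, sep=";"):
    """Count how many papers mention each unordered pair of keywords."""
    pair_counts = Counter()
    for field in keyword_fields:
        # Normalize case and whitespace, and drop duplicates within one paper.
        keywords = sorted({k.strip().lower() for k in field.split(sep) if k.strip()})
        # Every unordered pair of keywords in the same paper co-occurs once.
        pair_counts.update(combinations(keywords, 2))
    return pair_counts


pairs = keyword_cooccurrence(records)
for (first, second), count in pairs.most_common(3):
    print(f"{first} + {second}: {count}")
```

Sorting the keywords before pairing ensures that each pair is stored under a single key regardless of the order in which a paper lists its keywords.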


Keywords: user-generated content, bibliometric analysis, research trend, country, co-occurrence.





