Optimizing Sentiment Analysis of Hotel Reviews Using PCA and Machine Learning for Tourism Business Decision Support

Authors

  • Putri Taqwa Prasetyaningrum, Information Systems Study Program, Faculty of Information Technology, Universitas Mercu Buana Yogyakarta
  • Norshahila Ibrahim, Universiti Pendidikan Sultan Idris, Tanjong Malim, Perak, Malaysia
  • Ozzi Suria, Information Systems Study Program, Faculty of Information Technology, Universitas Mercu Buana Yogyakarta

Abstract

Sentiment analysis of hotel reviews provides valuable insights for improving customer satisfaction and service quality in the tourism industry. However, review data are high-dimensional and unstructured, which makes meaningful patterns difficult to extract. This study optimizes sentiment analysis by applying Principal Component Analysis (PCA) for dimensionality reduction before classification with machine learning models. The proposed approach consists of data preprocessing, dimensionality reduction using PCA, model training, and performance evaluation. Experimental results show that PCA improves classification accuracy and computational efficiency by removing redundant features. In the comparative analysis, the Voting classifier achieves the highest accuracy (95.29%) and F-score (97.50%), while the BiLSTM-FNN model attains the highest recall (99.95%). These findings highlight the potential of PCA-based sentiment analysis to support data-driven decision-making in hotel management, including better service quality, improved customer experience, and more effective marketing strategies.
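
The relationship between a voting ensemble and the reported metrics (accuracy, recall, F-score) can be illustrated with a minimal sketch. It is not the authors' implementation: the labels and model predictions are invented toy values, the function names `majority_vote` and `binary_metrics` are hypothetical, and hard (majority) voting is assumed because the abstract does not state which voting scheme was used.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, keep the label most models predicted."""
    voted = []
    for sample_preds in zip(*predictions_per_model):
        # An odd number of binary models guarantees there is no tie.
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-score for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Toy data (assumption): 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
model_b = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]
model_c = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]  # labels almost everything positive

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("voting", ensemble)]:
    print(name, binary_metrics(y_true, preds))
```

In this toy setting the voting ensemble is more accurate than any single model, while `model_c` reaches perfect recall only by over-predicting the positive class. This is one reason the best recall and the best accuracy need not come from the same model.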

Published

2025-08-23

How to Cite

Prasetyaningrum, P. T., Ibrahim, N., & Suria, O. (2025). Optimizing Sentiment Analysis of Hotel Reviews Using PCA and Machine Learning for Tourism Business Decision Support. Indonesian Journal of Information Systems, 8(1), 36–49. Retrieved from https://ojs.uajy.ac.id/index.php/IJIS/article/view/10978

Section

Articles