The Effect of Stopword Types on the Accuracy of the Multinomial Naïve Bayes Model in Sentiment Analysis

Authors

  • Jimmy Tjen, Informatics Study Program, Faculty of Information Technology, Universitas Widya Dharma Pontianak

Keywords:

customer reviews, Multinomial Naïve Bayes (MNB), Sentiment Analysis (SA), stopword

Abstract

The application of machine learning in business lets producers and sellers gauge the quality of their products by analyzing customer reviews with Sentiment Analysis (SA). This study examines how the type of stopword removed affects the accuracy of the Multinomial Naïve Bayes (MNB) method in SA. The study uses 10 stopword types, among them general words, conjunctions, slang, time adverbs, nouns, personal pronouns, interjections, verbs, and single-letter words. Based on a Friedman test on three review datasets covering three shoe products, removing conjunction stopwords (MNB-konjungsi) improves the accuracy of the MNB model in SA by 1%. T-test results on two of the three datasets show that MNB-konjungsi is more accurate than MNB without stopword removal.
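The idea behind MNB-konjungsi can be illustrated with a minimal sketch: a small Multinomial Naïve Bayes classifier with Laplace smoothing that drops conjunction stopwords before counting words. This is not the study's implementation. The stopword list `CONJUNCTION_STOPWORDS`, the helper names (`tokenize`, `train_mnb`, `predict_mnb`), the smoothing value, and the toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical conjunction stopword list (Indonesian); not the list used in the study.
CONJUNCTION_STOPWORDS = frozenset({"dan", "atau", "tetapi", "karena", "serta"})


def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop tokens found in `stopwords`."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


def train_mnb(docs, labels, stopwords=frozenset()):
    """Collect per-class word counts, class document counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text, stopwords)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_mnb(model, text, stopwords=frozenset(), alpha=1.0):
    """Return the class with the highest log posterior (Laplace smoothing `alpha`)."""
    word_counts, class_counts, vocab = model
    n_docs = sum(class_counts.values())
    tokens = tokenize(text, stopwords)
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen during training
            likelihood = (word_counts[label][tok] + alpha) / (total + alpha * len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy reviews, invented for this sketch (not data from the study).
train_docs = [
    "sepatu bagus dan nyaman",
    "kualitas bagus serta pengiriman cepat",
    "sepatu jelek dan rusak",
    "kualitas buruk karena jahitan rusak",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_mnb(train_docs, train_labels, CONJUNCTION_STOPWORDS)
prediction = predict_mnb(model, "sepatu nyaman dan bagus", CONJUNCTION_STOPWORDS)
print(prediction)  # prints "positive"
```

Passing an empty stopword set to the same functions gives the baseline without stopword removal, so the two variants can be compared on identical data. The toy example says nothing about the 1% gain reported above, which depends on the study's datasets and stopword lists.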

Published

2025-04-01