LIUS, VINCENT (2024) IMPLEMENTASI NAIVE BAYES DENGAN PENDEKATAN LEKSIKON DAN TF-IDF DALAM ANALISIS SENTIMEN. S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
As social media becomes a primary platform for public expression, accurate sentiment analysis has become crucial, particularly for evaluating app reviews. This research addresses the need for precise sentiment classification, motivated by the prevalence of negative feedback in app reviews. We propose a combined approach that integrates the Naive Bayes algorithm with lexicon-based sentiment labeling and TF-IDF weighting for model training. Using a dataset of 5,000 app reviews in Indonesia, taken from Kaggle, the study compares two Indonesian lexicons, InSet and SentiStrengthID, for sentiment labeling. Our evaluation shows that the SentiStrengthID lexicon performs better on all four metrics: the model using SentiStrengthID achieved an accuracy of 86%, precision of 85%, recall of 86%, and F1-score of 85%, compared with an accuracy of 70%, precision of 80%, recall of 70%, and F1-score of 71% for the model using InSet.
Keywords: Sentiment Analysis, InSet Lexicon, SentiStrengthID Lexicon, Naive Bayes, App Review.
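The combination described in the abstract (lexicon-based labeling, TF-IDF weighting, then Naive Bayes training) can be illustrated with a minimal sketch. This is not the thesis author's implementation: the lexicon below is a hypothetical toy dictionary rather than actual InSet or SentiStrengthID entries, the sample reviews are invented, the labeling rule (sign of the summed word scores) and the smoothed IDF formula are assumptions, and text pre-processing is reduced to whitespace splitting.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon (word -> polarity score).
# These are NOT real InSet or SentiStrengthID entries.
TOY_LEXICON = {"bagus": 3, "mantap": 4, "cepat": 2,
               "jelek": -3, "lambat": -2, "error": -4}

def lexicon_label(tokens, lexicon=TOY_LEXICON):
    """Label a review by the sign of its summed lexicon scores (assumed rule)."""
    score = sum(lexicon.get(t, 0) for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(tokens, idf):
    """TF-IDF weights for one document; out-of-vocabulary terms are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def train_nb(docs, labels, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""
    idf = fit_idf(docs)
    weights = defaultdict(Counter)  # class -> term -> summed TF-IDF weight
    for d, y in zip(docs, labels):
        weights[y].update(tfidf(d, idf))
    class_counts = Counter(labels)
    log_prior = {c: math.log(k / len(labels)) for c, k in class_counts.items()}
    log_lik = {}
    for c in class_counts:
        total = sum(weights[c].values()) + alpha * len(idf)
        log_lik[c] = {t: math.log((weights[c][t] + alpha) / total) for t in idf}
    return idf, log_prior, log_lik

def predict(tokens, model):
    """Return the class with the highest posterior log-score."""
    idf, log_prior, log_lik = model
    vec = tfidf(tokens, idf)
    scores = {c: log_prior[c] + sum(w * log_lik[c][t] for t, w in vec.items())
              for c in log_prior}
    return max(scores, key=scores.get)

# Invented sample reviews, labeled automatically by the toy lexicon.
reviews = ["aplikasi bagus dan cepat", "aplikasi jelek sering error",
           "mantap sangat bagus", "lambat dan error terus"]
docs = [r.split() for r in reviews]
labels = [lexicon_label(d) for d in docs]
model = train_nb(docs, labels)
print(predict("bagus cepat".split(), model))
```

In this sketch the lexicon only produces the training labels; the classifier itself sees nothing but TF-IDF weights, so swapping in a different lexicon changes the labels and therefore the trained model, which is the kind of comparison the abstract reports for InSet and SentiStrengthID.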