NOORLY, AKASYAH SAVINA (2026) PERBANDINGAN KINERJA ALGORITMA MACHINE LEARNING DALAM ANALISIS SENTIMEN BERBASIS TOPIK TERHADAP TWEET SERIAL BLACK MIRROR. S1 thesis, Universitas Mercu Buana Jakarta.
Text (Cover page): Cover.pdf (1MB)
Text (Chapter I): BAB 1.pdf (485kB), restricted to registered users only
Text (Chapter II): BAB 2.pdf (725kB), restricted to registered users only
Text (Chapter III): BAB 3.pdf (541kB), restricted to registered users only
Text (Chapter IV): BAB 4.pdf (1MB), restricted to registered users only
Text (Chapter V): BAB 5.pdf (472kB), restricted to registered users only
Text (References): Daftar Pustaka.pdf (360kB), restricted to registered users only
Text (Appendix): Lampiran.pdf (1MB), restricted to registered users only
Abstract
The social media platform X (Twitter) has become a primary venue for public expression about popular culture phenomena, including the television series Black Mirror. Conventional sentiment analysis, however, often fails to capture the specific context of the topics under discussion. This study conducts topic-based sentiment analysis to map public opinion in more depth and compares the performance of three machine learning algorithms: Logistic Regression, Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). The method comprises data preprocessing, topic modeling with Latent Dirichlet Allocation (LDA), automated sentiment labeling using RoBERTa with genre-rule adjustments, and sentiment classification. The dataset consists of 6,497 English tweets. Topic modeling identified five main discussion topics (Topic 0 to Topic 4). The sentiment distribution is led by Neutral responses (40.6%), followed by Negative (32%), which dominates Topic 0, and Positive (27.5%), which dominates Topic 1. In the performance comparison, XGBoost achieved the highest accuracy (95%), ahead of Logistic Regression (93%) and SVM (92%). The study concludes that combining LDA for topic modeling with XGBoost on contextual embedding features is a highly effective approach for analyzing public opinion in complex social media data.
Keywords: Sentiment Analysis, Topic Modeling, LDA, XGBoost, Black Mirror, RoBERTa.
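The aggregation step described in the abstract (overall sentiment shares, the dominant sentiment within each topic, and accuracy as the comparison metric) can be illustrated with a minimal sketch. This is not the thesis code: the records, the function names, and the lowercase label strings are hypothetical, and the toy data merely stands in for the output of the LDA and RoBERTa stages.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (topic_id, sentiment_label).
# Illustrative only; these are NOT data from the thesis.
records = [
    (0, "negative"), (0, "negative"), (0, "neutral"),
    (1, "positive"), (1, "positive"), (1, "neutral"),
    (2, "neutral"), (2, "neutral"), (2, "negative"),
]


def sentiment_distribution(rows):
    """Return each label's share of all rows, as a percentage."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def dominant_sentiment_by_topic(rows):
    """Return the most frequent sentiment label within each topic."""
    per_topic = defaultdict(Counter)
    for topic, label in rows:
        per_topic[topic][label] += 1
    return {topic: c.most_common(1)[0][0] for topic, c in per_topic.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    print(sentiment_distribution(records))
    print(dominant_sentiment_by_topic(records))
```

Under these assumptions, comparing classifiers amounts to calling `accuracy` once per model on the same held-out labels; the per-topic breakdown is what distinguishes topic-based analysis from a single corpus-wide sentiment share.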