ISTANTO, YUDHA ANDIKA (2026) ANALISIS SENTIMEN WACANA REDENOMINASI RUPIAH DENGAN LEXICON-BASED DAN PERBANDINGAN SVM, RANDOM FOREST, LOGISTIC REGRESSION. S1 thesis, Universitas Mercu Buana Jakarta.
Text (Cover Page): Cover.pdf (619kB)
Text (Chapter I): BAB 1.pdf (96kB), restricted to registered users
Text (Chapter II): BAB 2.pdf (316kB), restricted to registered users
Text (Chapter III): BAB 3.pdf (152kB), restricted to registered users
Text (Chapter IV): BAB 4.pdf (891kB), restricted to registered users
Text (Chapter V): BAB 5.pdf (61kB), restricted to registered users
Text (References): Daftar Pustaka.pdf (133kB), restricted to registered users
Text (Appendix): Lampiran.pdf (378kB), restricted to registered users
Abstract
This study analyzes public sentiment on Twitter toward the 2025 rupiah redenomination discourse. Tweets are labeled with a lexicon-based approach using the InSet Lexicon, and three machine learning algorithms are then compared: Support Vector Machine (SVM), Random Forest, and Logistic Regression. Web scraping yielded 3,559 tweets, of which 1,982 unique tweets remained after deduplication. Preprocessing comprised six stages: cleaning, case folding, normalization, tokenization, stopword removal, and stemming with Sastrawi. Labeling with the InSet Lexicon produced 794 positive tweets (40.06%), 630 negative tweets (31.79%), and 558 neutral tweets (28.15%), suggesting that public sentiment leans optimistic toward the policy. TF-IDF feature extraction generated 1,000 features, and class imbalance was addressed with the Synthetic Minority Over-sampling Technique (SMOTE).

SMOTE improved the performance of all three models. SVM achieved the highest accuracy at 75% (+7%), ahead of Random Forest (72%, +5%) and Logistic Regression (71%, +2%). SVM was the best-performing model, with a macro-average F1-score of 0.75 and balanced performance across all sentiment classes. The study contributes to mapping public opinion from social media data and demonstrates the effectiveness of combining lexicon-based labeling and machine learning with imbalanced-data handling for Indonesian-language sentiment analysis.

Keywords: sentiment analysis, rupiah redenomination, SVM, random forest, logistic regression
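The lexicon-based labeling step described in the abstract can be illustrated with a minimal sketch. It assumes that each token carries a signed polarity weight, that a tweet's score is the sum of its token weights, and that the sign of the score decides the label. The abstract does not state the exact scoring rule, so the zero threshold is an assumption. The words and weights in TOY_LEXICON are hypothetical stand-ins, not actual InSet Lexicon entries. The last lines recompute the class shares from the counts reported in the abstract.

```python
from collections import Counter

# Hypothetical toy lexicon: these words and weights are illustrative only,
# not actual InSet Lexicon entries.
TOY_LEXICON = {
    "bagus": 3,
    "setuju": 2,
    "untung": 2,
    "bingung": -2,
    "rugi": -3,
    "takut": -3,
}


def score_tokens(tokens, lexicon=TOY_LEXICON):
    """Sum the weights of tokens found in the lexicon (unknown tokens count as 0)."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


def label_tokens(tokens, lexicon=TOY_LEXICON):
    """Map a summed score to a label; the zero threshold is an assumption."""
    score = score_tokens(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def class_shares(labels):
    """Return each label's share of the total, in percent, rounded to 2 places."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Tweets are assumed to be already preprocessed into lists of stemmed tokens.
tweets = [
    ["redenominasi", "bagus", "setuju"],
    ["takut", "rugi"],
    ["redenominasi", "rupiah"],
]
labels = [label_tokens(t) for t in tweets]
print(labels)

# Shares implied by the counts reported in the abstract (794 / 630 / 558).
reported = ["positive"] * 794 + ["negative"] * 630 + ["neutral"] * 558
print(class_shares(reported))
```

The recomputed shares match the percentages given in the abstract, and the three counts sum to the 1,982 unique tweets.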
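The core idea behind SMOTE, as used above for class imbalance, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step only, written in plain Python. It is not the implementation used in the thesis, which the abstract does not specify, and the function name smote_oversample, the value of k, and the 2-D feature vectors are all hypothetical.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified sketch: requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the point itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors standing in for TF-IDF rows.
majority = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.9, 0.3], [0.8, 0.0]]
minority = [[0.1, 0.9], [0.2, 0.8], [0.1, 0.7]]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority samples, it stays inside the region the minority class already occupies. In practice a library implementation such as the SMOTE class in imbalanced-learn would normally be used, and oversampling is commonly applied to the training split only so that evaluation data stays untouched.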
