SENTIMENT CLASSIFICATION OF TWEETS ABOUT SICEPAT USING DATA BALANCING STRATEGIES

GHIFARY, RIFQI AL (2025) KLASIFIKASI SENTIMEN TWEET TENTANG SICEPAT DENGAN STRATEGI PENYEIMBANGAN DATA. S1 thesis, Universitas Mercu Buana Jakarta.

Abstract

This study develops an automatic sentiment classification system for Indonesian-language tweets about the SiCepat delivery service. A total of 15,000 tweets were collected with Tweet-Harvest and passed through preprocessing, stemming, and automatic labeling based on a weighted lexicon. The main problem encountered was class imbalance: positive tweets far outnumbered negative ones. To address it, three balancing strategies were applied: SMOTE, undersampling, and a combination of the two. Three machine learning algorithms were tested: Support Vector Machine (SVM), Naive Bayes (NB), and Logistic Regression (LR). The models were evaluated using accuracy, precision, recall, F1-score, and the confusion matrix. SVM with SMOTE achieved the best performance (90.5% accuracy and an F1-score of 0.907), followed by Logistic Regression with the combined balancing approach (89.2% accuracy). Naive Bayes tended to be biased toward the majority class. Overall, combining data balancing with SVM proved the most effective approach and is recommended for sentiment analysis in the logistics industry.

Keywords: Sentiment Analysis, Machine Learning, Class Imbalance, SMOTE, SVM, Logistic Regression, Naive Bayes, Twitter, SiCepat, Tweet-Harvest, Social Media
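The three balancing strategies named in the abstract (SMOTE, undersampling, and a combination of both) can be illustrated with a small standard-library sketch. This is not the thesis's implementation: the abstract does not state how tweets were vectorized or how the combined strategy was configured, so the 2-D toy points, the function names `smote_oversample` and `balance`, and the midpoint target used for the combined case are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (the point itself excluded).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def balance(majority, minority, strategy, seed=0):
    """Return (majority, minority) lists of equal size for the chosen strategy."""
    targets = {
        "smote": len(majority),        # grow the minority up to the majority size
        "undersample": len(minority),  # shrink the majority down to the minority size
        # Assumption: the combined strategy meets halfway between the two sizes.
        "combined": (len(majority) + len(minority)) // 2,
    }
    if strategy not in targets:
        raise ValueError(f"unknown strategy: {strategy}")
    target = targets[strategy]
    rng = random.Random(seed)
    if target < len(majority):
        kept_majority = rng.sample(majority, target)  # random undersampling
    else:
        kept_majority = list(majority)
    new_minority = list(minority) + smote_oversample(
        minority, target - len(minority), seed=seed
    )
    return kept_majority, new_minority


# Toy feature vectors standing in for vectorized tweets (hypothetical data).
MAJORITY = [(float(i), 5.0) for i in range(8)]
MINORITY = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

for name in ("smote", "undersample", "combined"):
    maj, mino = balance(MAJORITY, MINORITY, name)
    print(name, len(maj), len(mino))
```

In practice, balancing of this kind is normally applied to the training split only, so that synthetic or discarded samples do not distort the accuracy, precision, recall, and F1-score measured on held-out data.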

Item Type: Thesis (S1)
Call Number CD: FIK/INFO. 25 130
NIM/NIDN Creators: 41520010023
Uncontrolled Keywords: Sentiment Analysis, SMOTE, Logistic Regression, SVM, Naive Bayes, Class Imbalance, SiCepat, Tweet-Harvest, Social Media
Subjects: 000 Computer Science, Information and General Works > 004 Data Processing, Computer Science, Informatics Engineering
000 Computer Science, Information and General Works > 006 Special Computer Methods > 006.3 Artificial Intelligence > 006.31 Machine Learning
000 Computer Science, Information and General Works > 006 Special Computer Methods > 006.7 Multimedia Systems > 006.75 Social Multimedia > 006.754 Online Social Networks, Social Media
500 Natural Science and Mathematics > 510 Mathematics > 518 Numerical Analysis > 518.1 Algorithms
Divisions: Faculty of Computer Science > Informatics
Depositing User: khalimah
Date Deposited: 07 Aug 2025 03:51
Last Modified: 07 Aug 2025 03:51
URI: http://repository.mercubuana.ac.id/id/eprint/96638
