WIJDAN, MUHAMAD ZAKY (2025) PERBANDINGAN ANALISIS SENTIMEN PADA APLIKASI X TERKAIT DIABETES MENGGUNAKAN METODE K-NEAREST NEIGHBORS (KNN) DAN SUPPORT VECTOR MACHINE (SVM). S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
This study compares the performance of two algorithms, K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), in analysing public sentiment about diabetes. The data came from the social media platform X (formerly Twitter) and was collected by crawling with diabetes-related keywords over the period 2020–2024, yielding 8,607 Indonesian-language tweets. After pre-processing (cleaning, case folding, normalisation, tokenisation, and stemming), each tweet was manually labelled as positive or negative. Features were extracted with TF-IDF, and the data was then split into training and testing sets. Both models were trained and tested on classifying tweets by sentiment. In the evaluation, SVM achieved the higher accuracy, 81.69%, with balanced precision and recall for negative sentiment, while KNN reached 68.31% accuracy and performed worse at classifying positive sentiment. These findings indicate that SVM handles high-dimensional text data more effectively and is the more reliable of the two models for social-media-based sentiment analysis, particularly on health topics such as diabetes.
Keywords: Sentiment Analysis, Twitter, Diabetes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), TF-IDF.
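The pipeline described in the abstract (pre-processing, TF-IDF weighting, then KNN and SVM classifiers checked on held-out tweets) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the example tweets and labels are invented, the function names are hypothetical, normalisation and stemming are omitted, KNN uses cosine similarity with k=3, and the SVM is a linear one trained with a Pegasos-style subgradient update in a fixed pass order. The abstract does not state the library, kernel, value of k, or split ratio used, so this should not be read as the author's implementation, and its toy output says nothing about the reported 81.69% and 68.31% accuracies.

```python
import math
import re
from collections import Counter

def preprocess(text):
    """Case folding, cleaning (URLs, mentions, non-letters), whitespace tokenisation."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()

def fit_idf(docs):
    """Smoothed inverse document frequency learned from tokenised training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + c)) + 1 for term, c in df.items()}

def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(term for term in doc if term in idf)
    vec = {term: c * idf[term] for term, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {term: v / norm for term, v in vec.items()}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(term, 0.0) for term, v in a.items())

def knn_predict(x, train_x, train_y, k=3):
    """Majority vote among the k training vectors most cosine-similar to x."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: dot(x, p[0]), reverse=True)
    return Counter(y for _, y in ranked[:k]).most_common(1)[0][0]

def train_linear_svm(xs, ys, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent on the hinge loss; labels are +1 / -1."""
    w, t = {}, 0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            t += 1
            eta = 1.0 / (lam * t)           # decaying step size
            margin = y * dot(w, x)
            for term in w:                  # shrink weights (L2 regularisation)
                w[term] *= 1 - eta * lam
            if margin < 1:                  # margin violated: move towards y * x
                for term, v in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * v
    return w

def svm_predict(x, w):
    return 1 if dot(w, x) >= 0 else -1

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Invented toy data: +1 = positive sentiment, -1 = negative sentiment.
train = [
    ("Alhamdulillah gula darah stabil berkat olahraga rutin", 1),
    ("Sedih ayah kena komplikasi diabetes parah", -1),
    ("Senang diet sehat bantu kontrol diabetes", 1),
    ("Takut gula darah naik terus", -1),
    ("Olahraga rutin bikin gula darah stabil", 1),
    ("Komplikasi diabetes bikin takut", -1),
]
test = [
    ("Senang banget olahraga rutin, hasilnya stabil!", 1),
    ("Takut kena komplikasi parah @dokter https://t.co/xyz", -1),
]

train_docs = [preprocess(text) for text, _ in train]
train_y = [label for _, label in train]
idf = fit_idf(train_docs)                   # fitted on training data only
train_x = [tfidf(doc, idf) for doc in train_docs]
test_x = [tfidf(preprocess(text), idf) for text, _ in test]
test_y = [label for _, label in test]

knn_preds = [knn_predict(x, train_x, train_y) for x in test_x]
weights = train_linear_svm(train_x, train_y)
svm_preds = [svm_predict(x, weights) for x in test_x]

print("KNN accuracy:", accuracy(knn_preds, test_y))
print("SVM accuracy:", accuracy(svm_preds, test_y))
```

The IDF weights are fitted on the training tweets only and then reused for the test tweets, so no information from the held-out set leaks into the features. On a real corpus of several thousand tweets, a library implementation with a proper train/test split and per-class precision and recall would be the more practical choice.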