DIANSHAH, DWIKA ERFA (2025) OPTIMALISASI INDOBERT UNTUK ANALISIS SENTIMEN STUDI KASUS: APLIKASI MYPERTAMINA DI PLAY STORE. S1 thesis, Universitas Mercu Buana Jakarta.
Files:
- 01 COVER.pdf (258 kB) — Cover
- 02 BAB 1.pdf (47 kB) — Chapter I (restricted to registered users)
- 03 BAB 2.pdf (176 kB) — Chapter II (restricted to registered users)
- 04 BAB 3.pdf (115 kB) — Chapter III (restricted to registered users)
- 05 BAB 4.pdf (1 MB) — Chapter IV (restricted to registered users)
- 06 BAB 5.pdf (100 kB) — Chapter V (restricted to registered users)
- 07 DAFTAR PUSTAKA.pdf (161 kB) — References (restricted to registered users)
- 08 LAMPIRAN.pdf (841 kB) — Appendices (restricted to registered users)
Abstract
This study aims to classify sentiment in user reviews of the MyPertamina application using the IndoBERT model with a semi-supervised learning approach. A total of 2,355 reviews were collected via web scraping and preprocessed, with 1,000 manually labeled samples used to determine the optimal training size. The best performance was achieved using 800 labeled samples, reaching an F1-score of 98.10%, while fine-tuning for 10 epochs resulted in 98.75% accuracy. The model was then used for automatic labeling, yielding a high confidence rate (99.1%) and a dominant negative sentiment (77.5%). To address the class imbalance, random oversampling was applied, and the re-trained model achieved a balanced F1-score of 0.98 across both sentiment classes. These results demonstrate that IndoBERT is highly effective and reliable for sentiment analysis in Indonesian-language text.

Keywords: Sentiment Analysis, IndoBERT, Automatic Labeling, MyPertamina, Fine-Tuning, Classification, Text Mining, NLP
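The automatic-labeling step described above can be sketched as a confidence filter over model predictions: only pseudo-labels whose top class probability clears a threshold are kept for retraining. This is a minimal, model-agnostic illustration — the threshold value and the two-class layout are assumptions for the example; in the thesis, IndoBERT would supply the probabilities.

```python
def pseudo_label(probs, threshold=0.95):
    """Confidence-filtered pseudo-labeling (semi-supervised sketch).

    probs: list of per-class probability lists (one per unlabeled review).
    Returns (index, predicted_label) pairs whose max probability
    meets the confidence threshold; low-confidence rows are dropped.
    """
    kept = []
    for i, p in enumerate(probs):
        label = max(range(len(p)), key=lambda j: p[j])  # argmax class
        if p[label] >= threshold:
            kept.append((i, label))
    return kept

# Toy softmax outputs for three unlabeled reviews (0 = negative, 1 = positive)
probs = [[0.992, 0.008], [0.60, 0.40], [0.03, 0.97]]
print(pseudo_label(probs))  # → [(0, 0), (2, 1)]
```

Only the first and third reviews survive the filter; the ambiguous second review (60/40 split) is left unlabeled rather than risk propagating noise into the training set.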
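The random oversampling used to correct the 77.5% negative skew can be illustrated with a self-contained sketch that duplicates minority-class samples until all classes match the majority count. The function name and toy data are assumptions for illustration; in practice a library such as imbalanced-learn (`RandomOverSampler`) is typically used.

```python
import random

def random_oversample(texts, labels, seed=42):
    """Duplicate minority-class samples at random until every
    class matches the majority-class count (oversampling sketch)."""
    rng = random.Random(seed)
    by_class = {}
    for t, y in zip(texts, labels):
        by_class.setdefault(y, []).append(t)
    target = max(len(v) for v in by_class.values())
    out_texts, out_labels = [], []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for t in items + extra:
            out_texts.append(t)
            out_labels.append(y)
    return out_texts, out_labels

# Toy imbalanced review set mimicking the negative-dominant distribution
texts = ["bad app"] * 7 + ["great app"] * 2
labels = ["neg"] * 7 + ["pos"] * 2
bt, bl = random_oversample(texts, labels)
print(bl.count("neg"), bl.count("pos"))  # → 7 7
```

After oversampling, both classes contribute equally to training, which is what allows the retrained model to report a balanced F1-score across the two sentiment classes.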