FASYA, AQILLA RIDZKY ISLAMI (2025) ANALISIS ALGORITMA SUPPORT VECTOR MACHINE, LOGISTIC REGRESSION, DAN RANDOM FOREST UNTUK KLASIFIKASI GANGGUAN BICARA DYSARTHRIA. S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
Automatic detection of dysarthric speech is important for timely clinical intervention but is still rarely applied in Indonesia. This study analyses and compares the performance of three classification algorithms, Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), in distinguishing dysarthric speech from typical speech. The corpus is the TORGO dataset, which comprises 8,214 English-language recordings. All files were resampled to 16 kHz mono, RMS-normalised, and trimmed of silence. Each recording was represented by the mean of 13 Mel-Frequency Cepstral Coefficients (MFCCs). The data were split with stratification at the speaker level (80% train, 20% test) to prevent identity leakage. Models were evaluated using accuracy, precision, recall, and F1-score. RF outperformed the other models with 96.74% accuracy, 0.85 recall for the dysarthria class, and a 0.91 F1-score. SVM achieved 89.86% accuracy but only 0.59 recall, while LR reached 88.83% accuracy with 0.56 recall. These findings indicate that MFCC features combined with an ensemble RF model were the most reliable of the three under class imbalance, whereas SVM and LR require additional class-balancing techniques. The study recommends RF as a baseline for developing automatic dysarthria screening systems and encourages future work on more advanced acoustic features and on Indonesian speech data.
Keywords: Dysarthria, Random Forest, Support Vector Machine, Logistic Regression, MFCC
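Two points in the abstract are easier to see in code: a split made per speaker rather than per recording, and why accuracy can stay high while recall for the minority class is low. The block below is a minimal sketch and is not taken from the thesis. The function names `speaker_level_split` and `binary_metrics`, the speaker IDs, and the label counts in the demo are all hypothetical, and the split shown is a plain speaker-level split without the class stratification the abstract mentions.

```python
import random
from collections import defaultdict


def speaker_level_split(speakers, test_fraction=0.2, seed=0):
    """Split utterance indices so that no speaker appears in both sets.

    `speakers` holds one speaker ID per utterance. The test fraction is
    applied to the number of speakers, not the number of utterances.
    Assumption: no stratification by class is attempted here.
    """
    by_speaker = defaultdict(list)
    for index, speaker in enumerate(speakers):
        by_speaker[speaker].append(index)

    ids = sorted(by_speaker)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])

    train = [i for s in ids if s not in test_ids for i in by_speaker[s]]
    test = [i for s in ids if s in test_ids for i in by_speaker[s]]
    return sorted(train), sorted(test)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical speaker IDs, one per utterance.
    speakers = ["spk_a"] * 3 + ["spk_b"] * 2 + ["spk_c"] * 4 + ["spk_d"] + ["spk_e"] * 5
    train_idx, test_idx = speaker_level_split(speakers)
    print("train utterances:", len(train_idx), "test utterances:", len(test_idx))

    # Illustrative imbalance only (not the thesis data): 90 typical (0),
    # 10 dysarthric (1), of which the classifier finds 4.
    y_true = [0] * 90 + [1] * 10
    y_pred = [0] * 90 + [1] * 4 + [0] * 6
    print(binary_metrics(y_true, y_pred))
```

In the illustrative demo, most recordings belong to the majority class, so the classifier is right on most of them even though it misses more than half of the minority class. This is the same pattern the abstract reports for SVM and LR, and it is why recall and F1 for the dysarthria class are reported alongside accuracy.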