ANALYSIS OF SUPPORT VECTOR MACHINE, LOGISTIC REGRESSION, AND RANDOM FOREST ALGORITHMS FOR CLASSIFYING THE SPEECH DISORDER DYSARTHRIA

FASYA, AQILLA RIDZKY ISLAMI (2025) ANALISIS ALGORITMA SUPPORT VECTOR MACHINE, LOGISTIC REGRESSION, DAN RANDOM FOREST UNTUK KLASIFIKASI GANGGUAN BICARA DYSARTHRIA. S1 thesis, Universitas Mercu Buana Jakarta.

Abstract

Automatic detection of dysarthric speech is crucial for timely clinical intervention but remains under-explored in Indonesia. This study compares three classification algorithms, Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), on the task of distinguishing dysarthric from typical speech. The corpus is the TORGO dataset, comprising 8,214 English-language recordings. All files were resampled to 16 kHz mono, RMS-normalised, and trimmed of silence. Acoustic features were the averages of 13 Mel-Frequency Cepstral Coefficients (MFCCs). The data were split 80% train and 20% test in a stratified manner at the speaker level, so that no speaker's identity leaks from training into testing. Models were evaluated using accuracy, precision, recall, and F1-score. RF performed best, with 96.74% accuracy, 0.85 recall for the dysarthria class, and a 0.91 F1-score. SVM reached 89.86% accuracy but only 0.59 recall, and LR reached 88.83% accuracy with 0.56 recall. These results indicate that MFCC features combined with an ensemble RF model were the most reliable of the three under class imbalance, whereas SVM and LR need additional class-balancing techniques. The study recommends RF as a baseline for automatic dysarthria screening systems and encourages future work on more advanced acoustic features and on Indonesian speech data.

Keywords: Dysarthria, Random Forest, Support Vector Machine, Logistic Regression, MFCC
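
Two ideas in the abstract can be illustrated with a short sketch: splitting data so that no speaker appears in both the training and test sets, and computing recall alongside accuracy so that a weak minority-class result is not hidden by class imbalance. The code below is not the thesis's implementation. The function names, the `(speaker_id, features, label)` sample layout, the choice to apply the 80/20 ratio to speakers rather than recordings, and all counts in the demo are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def speaker_independent_split(samples, test_fraction=0.2, seed=0):
    """Split (speaker_id, features, label) tuples so that every speaker
    lands entirely in either the training set or the test set.

    Assumption: the test fraction is applied to the number of speakers.
    """
    by_speaker = defaultdict(list)
    for sample in samples:
        by_speaker[sample[0]].append(sample)

    speakers = sorted(by_speaker)            # sort first for reproducibility
    random.Random(seed).shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))

    test_speakers = speakers[:n_test]
    train_speakers = speakers[n_test:]
    train = [s for spk in train_speakers for s in by_speaker[spk]]
    test = [s for spk in test_speakers for s in by_speaker[spk]]
    return train, test


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical data: 10 speakers with 3 recordings each (not TORGO).
    demo = [(f"spk{i}", None, i % 2) for i in range(10) for _ in range(3)]
    train, test = speaker_independent_split(demo)
    print(len(train), len(test))

    # Hypothetical imbalanced labels: 20 positive, 80 negative.
    y_true = [1] * 20 + [0] * 80
    y_pred = [1] * 12 + [0] * 8 + [1] * 2 + [0] * 78
    print(binary_metrics(y_true, y_pred))
```

With the hypothetical labels in the demo, most predictions are correct overall, yet a large share of the positive class is missed. This is the same pattern the abstract reports for SVM and LR, and it is why recall on the dysarthria class is reported next to accuracy.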

Item Type: Thesis (S1)
Call Number CD: FIK/INFO. 25 091
NIM/NIDN Creators: 41521010015
Uncontrolled Keywords: Dysarthria, Random Forest, Support Vector Machine, Logistic Regression, MFCC
Subjects: 000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 000. Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 004 Data Processing, Computer Science/Pemrosesan Data, Ilmu Komputer, Teknik Informatika
000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 020 Library and Information Sciences/Perpustakaan dan Ilmu Informasi > 025 Operations, Archives, Information Centers/Operasional Perpustakaan, Arsip dan Pusat Informasi, Pelayanan dan Pengelolaan Perpustakaan > 025.3 Bibliographic Analysis and Control/Bibliografi Analisis dan Kontrol Perpustakaan > 025.34 Cataloging, Classification, Indexing of Special Materials/Pengatalogan, Klasifikasi, Pengindeksan Bahan Tertentu > 025.344 Machine-Readable Materials/Bahan yang Dapat Dibaca Mesin
000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 020 Library and Information Sciences/Perpustakaan dan Ilmu Informasi > 025 Operations, Archives, Information Centers/Operasional Perpustakaan, Arsip dan Pusat Informasi, Pelayanan dan Pengelolaan Perpustakaan > 025.4 Subject Analysis and Control/Subjek Analisis dan Kontrol Perpustakaan > 025.46 Classification of Specific Subject/Klasifikasi Khusus
500 Natural Science and Mathematics/Ilmu-ilmu Alam dan Matematika > 510 Mathematics/Matematika > 518 Numerical Analysis/Analisis Numerik, Analisa Numerik > 518.1 Algorithms/Algoritma
Divisions: Fakultas Ilmu Komputer > Informatika
Depositing User: khalimah
Date Deposited: 02 Aug 2025 02:26
Last Modified: 02 Aug 2025 02:26
URI: http://repository.mercubuana.ac.id/id/eprint/96457
