AKBAR, MUHAMMAD FADHLILLAH RAISUL (2023) ANALISIS KINERJA MODEL ALGORITMA RNNOISE SEBAGAI SPEECH ENHANCEMENT PADA KONFERENSI VIDEO [Performance Analysis of the RNNoise Algorithm Model as Speech Enhancement in Video Conferencing]. S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
For several years, Indonesia and every other country in the world experienced the coronavirus pandemic, which disrupted daily human activities ranging from work and the economy to education and health. One widely felt consequence was the restriction of in-person activities, so that work and mandatory group meetings had to be carried out online using video conferencing software. The use of deep learning techniques in speech enhancement has led to denoising systems that combine speech signal processing with deep learning. This work focuses on implementing a noise reduction algorithm with the RNNoise model to maximize the intensity of the human voice. The research aims to suppress the intensity of noise as far as possible while raising the intensity of speech, yielding better, higher-quality audio. Deep learning with the RNNoise algorithm is used to enhance speech in video conferencing, working from audio files. The final result is that the RNNoise algorithm model can perform speech enhancement suitable for video conferencing at low complexity. Training the model took 21 minutes over 120 epochs. The final Loss value lies between a maximum of 0.0014 and a minimum of 0.000446118, with an average of 0.000457586. The Val Loss value lies between 0.0040 and 0.0047, with an average of 0.004320833. On test audio from outside the dataset, the average PESQ value is 1.059. Keywords: Deep Learning, RNNoise, Speech Enhancement.
Item Type: | Thesis (S1) |
---|---|
Call Number CD: | FT/ELK. 23 092 |
NIM/NIDN Creators: | 41421120062 |
Uncontrolled Keywords: | Deep Learning, RNNoise, Speech Enhancement. |
Subjects: | 100 Philosophy and Psychology > 150 Psychology > 154 Subconscious and Altered States and Process > 154.6 Sleep Phenomena > 154.63 Dreams > 154.634 Analysis; 600 Technology > 620 Engineering and Applied Operations > 621 Applied Physics > 621.3 Electrical Engineering, Lighting, Superconductivity, Magnetic Engineering, Applied Optics, Paraphotic Technology, Electronics Communications Engineering, Computers |
Divisions: | Fakultas Teknik > Teknik Elektro |
Depositing User: | CALVIN PRASETYO |
Date Deposited: | 07 Sep 2023 06:52 |
Last Modified: | 07 Sep 2023 06:52 |
URI: | http://repository.mercubuana.ac.id/id/eprint/80501 |