REIZA, MUHAMMAD (2025) PENERAPAN ALGORITMA NAIVE BAYES UNTUK DETEKSI SPAM PADA EMAIL [Application of the Naive Bayes Algorithm for Spam Detection in Email]. S1 thesis, Universitas Mercu Buana Jakarta.
Text (Cover page)
01 COVER.pdf (434kB)
Text (Chapter I)
02 BAB 1.pdf (40kB), restricted to registered users
Text (Chapter II)
03 BAB 2.pdf (179kB), restricted to registered users
Text (Chapter III)
04 BAB 3.pdf (71kB), restricted to registered users
Text (Chapter IV)
05 BAB 4.pdf (148kB), restricted to registered users
Text (Chapter V)
06 BAB 5.pdf (40kB), restricted to registered users
Text (Bibliography)
07 DAFTAR PUSTAKA.pdf (155kB), restricted to registered users
Text (Appendices)
08 LAMPIRAN.pdf (440kB), restricted to registered users
Abstract
Spam email detection has become a crucial issue in the digital age. Among the high volume of emails I receive daily, I often encounter unwanted or potentially dangerous messages, such as phishing attempts or malware distribution. This situation underscores the need for effective and reliable spam detection systems. I therefore directed this research towards implementing the Naive Bayes algorithm as a classification method for detecting spam in emails automatically.

I began by collecting a labeled email dataset (spam/non-spam). The data then went through a series of preprocessing and feature extraction steps. I applied CountVectorizer to transform the email text into a numerical representation, a word frequency matrix, which served as input to the Naive Bayes algorithm. I trained the model to estimate word probabilities and assign each email to a category. For training and testing, I split the dataset proportionally.

The evaluation results show that the Naive Bayes algorithm performed very well, with an accuracy of 0.9856 (98.56%), precision of 0.9567 (95.67%), recall of 0.9412 (94.12%), and an F1-score of 0.9489 (94.89%). Although the confusion matrix contains 5 false positives (legitimate emails classified as spam) and 10 false negatives (spam that passed through), these figures indicate that the model is highly effective at identifying spam emails. These findings offer useful insights for developers of spam filters, and I hope they will serve as a reference for further studies aimed at improving spam detection.

Keywords: Spam Detection, Email, Naive Bayes Algorithm, Text Classification, Machine Learning.
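The pipeline described in the abstract (word counts as features, then Naive Bayes over those counts) can be illustrated with a minimal sketch. This is not the thesis code: it uses only the Python standard library as a stand-in for CountVectorizer-style counting, assumes a multinomial model with Laplace smoothing, and the function names (`tokenize`, `train`, `predict`, `f1_score`) and the four toy messages are hypothetical. The last function recomputes the F1-score from the precision and recall reported above.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(docs, labels, alpha=1.0):
    # Count words per class, then convert the counts to smoothed log-probabilities.
    vocab = sorted({w for d in docs for w in tokenize(d)})
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))

    doc_counts = Counter(labels)
    log_prior = {c: math.log(doc_counts[c] / len(docs)) for c in classes}

    log_like = {}
    for c in classes:
        # Laplace smoothing: add alpha to every vocabulary word's count.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_like


def predict(model, text):
    # Sum the log prior and the log-likelihood of each known word; pick the best class.
    log_prior, log_like = model
    scores = {}
    for c in log_prior:
        known = [w for w in tokenize(text) if w in log_like[c]]
        scores[c] = log_prior[c] + sum(log_like[c][w] for w in known)
    return max(scores, key=scores.get)


def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
train_docs = [
    "win free prize now",
    "free money offer",
    "meeting schedule tomorrow",
    "project report attached",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train(train_docs, train_labels)
print(predict(model, "free prize offer"))          # spam
print(predict(model, "project meeting tomorrow"))  # ham
print(round(f1_score(0.9567, 0.9412), 4))          # 0.9489
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from forcing that class's probability to zero.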