SARJITO, RAMADHANI NUR (2025) PENERAPAN KLASIFIKASI MULTI-LABEL PADA RESEP MASAKAN INDONESIA UNTUK DETEKSI ALERGEN MAKANAN MENGGUNAKAN ALGORITMA K-NEAREST NEIGHBOR, RANDOM FOREST, DAN XGBOOST. S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
Food allergies are a health problem that significantly affects quality of life and can cause severe, even life-threatening, reactions. In Indonesia, approximately 1.6 million children have a cow's milk allergy, which highlights the need for a system that can detect allergen content automatically. This study develops an allergen detection system for Indonesian food recipes using multi-label classification with machine learning. A total of 7,840 recipes were collected from Cookpad.com by web scraping. Labels were assigned automatically from keyword lists for five main allergen categories: milk, peanuts, eggs, seafood, and wheat. The collected data were then preprocessed (text cleaning, removal of punctuation and numbers, lowercasing, tokenization, stopword removal, and stemming) to produce a clean and consistent representation. Three machine learning algorithms, K-Nearest Neighbors (KNN), Random Forest (RF), and Extreme Gradient Boosting (XGB), were applied to build the classification models, which were evaluated with accuracy, precision, recall, and F1-score. XGB with hyperparameters tuned via GridSearchCV performed best, achieving the highest recall, 0.9794, for milk allergen detection. The system was implemented as a Streamlit web application so that users can check food recipes for allergen content in practice.

Keywords: Allergen Detection, K-Nearest Neighbor (KNN), Random Forest (RF), Extreme Gradient Boosting (XGB), Indonesian Recipes
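
The keyword-based labeling step described in the abstract could look roughly like the sketch below. It is an illustration only: the keyword lists, the names `ALLERGEN_KEYWORDS`, `normalize`, and `label_recipe`, and the sample recipe are all assumptions, since the abstract does not give the thesis's actual lists or code. Stopword removal and stemming are omitted for brevity, and matching the single word "kacang" is a simplification that would also catch other beans and nuts.

```python
import re

# Hypothetical keyword lists (Indonesian ingredient words), one per allergen.
# The actual lists used in the thesis are not given in the abstract.
ALLERGEN_KEYWORDS = {
    "milk": {"susu", "keju", "mentega"},
    "peanuts": {"kacang"},
    "eggs": {"telur"},
    "seafood": {"udang", "cumi", "kerang"},
    "wheat": {"terigu", "gandum"},
}


def normalize(text):
    """Lowercase, drop punctuation and digits, and split into tokens."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()


def label_recipe(ingredients):
    """Return a multi-label vector: 1 if any keyword for the allergen appears."""
    tokens = set(normalize(ingredients))
    return {
        allergen: int(bool(tokens & keywords))
        for allergen, keywords in ALLERGEN_KEYWORDS.items()
    }


if __name__ == "__main__":
    sample = "2 butir Telur, 200 gr tepung terigu, 100 ml susu cair!"
    print(label_recipe(sample))
```

Each recipe receives one binary flag per allergen, so a single recipe can carry several labels at once. That is what makes the task multi-label rather than multi-class, and a label matrix of this shape is the kind of target a multi-label classifier is trained on.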