FIRANI, BERTA (2025) ANALISIS SEGMENTASI DAN KLASIFIKASI PENJUALAN PRODUK MARKETPLACE MENGGUNAKAN METODE K-MEANS CLUSTERING DAN K-NEAREST NEIGHBOR (STUDI KASUS: SHOPEE). S1 thesis, Universitas Mercu Buana Jakarta.
Abstract
Advances in information technology have shifted buying and selling from traditional methods to online business on marketplace platforms such as Shopee. However, many novice sellers struggle to identify promising products to sell because they lack data-based insights. To address this problem, this study proposes an approach that combines the K-Means Clustering and K-Nearest Neighbor (KNN) algorithms to segment products and predict their sales levels in a more targeted and comprehensive manner. Product data from Shopee were collected by web scraping, with attributes including product name, price, category, city, number of sales, and rating. K-Means is used to group products into several clusters; the three main clusters are then labeled "Highly in demand", "Quite in demand", and "Less in demand", and validated using the Davies-Bouldin Index (DBI). The best DBI value is found at k = 3 for each category, with all values below 1.0, indicating a good and valid clustering structure. Next, the KNN algorithm is applied to the clustered products to predict sales levels in two categories, "High Sales" and "Low Sales", with validation using the Confusion Matrix. The results show that this approach is effective in generating in-depth insights into product performance on Shopee. The K-Means model forms clusters that are representative of consumer interest patterns, while KNN shows high classification performance, with accuracy of up to 96%, precision of 96%, and recall of 100% in certain categories. This approach not only provides a basis for data-based decision making for sellers, but also contributes to the academic literature on marketplace analysis using data mining techniques.
Keywords: Shopee, Data Mining, K-Means Clustering, K-Nearest Neighbor, Product Segmentation, Sales Prediction
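The pipeline described in the abstract (K-Means with k = 3, validation with the Davies-Bouldin Index, then KNN classification checked with a confusion matrix) can be illustrated with a minimal sketch. This is not the thesis code, which is not available on this page. The data points, the two scaled features (units sold, rating), the deterministic centroid initialisation, the rule that maps the top cluster to "High Sales", and the leave-one-out evaluation are all assumptions made for illustration only.

```python
import math
from collections import Counter

DEMAND_NAMES = ["Highly in demand", "Quite in demand", "Less in demand"]

def centroid(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def assign(points, centers):
    # Index of the nearest center for every point.
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50):
    # Assumption: deterministic init from evenly spaced sorted points,
    # chosen only to keep this sketch reproducible.
    ordered = sorted(points)
    centers = ordered[:: max(1, len(ordered) // k)][:k]
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(centroid(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers

def davies_bouldin(points, labels, centers):
    # Mean over clusters of the worst (scatter_i + scatter_j) / distance ratio.
    # Assumes every cluster has at least one member.
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(math.dist(p, centers[j]) for p in members) / len(members))
    worst = [max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k

def name_clusters(points, labels, k):
    # Rank clusters by mean of the first feature (scaled units sold).
    mean_sold = {j: sum(p[0] for p, lab in zip(points, labels) if lab == j)
                 / labels.count(j) for j in range(k)}
    ranked = sorted(mean_sold, key=mean_sold.get, reverse=True)
    return {cluster: DEMAND_NAMES[rank] for rank, cluster in enumerate(ranked)}

def knn_predict(train_x, train_y, x, k=3):
    nearest = sorted(zip(train_x, train_y), key=lambda t: math.dist(t[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def binary_metrics(y_true, y_pred, positive="High Sales"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def run_demo():
    # Hypothetical products as (scaled units sold, scaled rating).
    points = [
        (0.90, 0.95), (0.85, 0.90), (0.95, 0.85), (0.80, 0.92),
        (0.50, 0.60), (0.55, 0.50), (0.45, 0.55), (0.50, 0.45),
        (0.10, 0.20), (0.15, 0.10), (0.05, 0.15), (0.12, 0.25),
    ]
    labels, centers = kmeans(points, k=3)
    dbi = davies_bouldin(points, labels, centers)
    names = name_clusters(points, labels, 3)
    # Assumed labeling rule: only the top cluster counts as "High Sales".
    y = ["High Sales" if names[lab] == "Highly in demand" else "Low Sales"
         for lab in labels]
    # Leave-one-out KNN: predict each product from all the others.
    preds = [knn_predict(points[:i] + points[i + 1:], y[:i] + y[i + 1:], points[i])
             for i in range(len(points))]
    return dbi, binary_metrics(y, preds)

if __name__ == "__main__":
    dbi, metrics = run_demo()
    print(f"DBI = {dbi:.3f}", metrics)
```

A DBI below 1.0 on this toy data reflects only how well separated the made-up points are; it says nothing about the scraped Shopee data. In practice a library such as scikit-learn would likely replace the hand-written functions.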