TOPIC DISCOVERY AND CLASSIFICATION COMPARISON ON THE COMMENTS OF INDONESIAN ENTERTAINMENT YOUTUBE CHANNEL VIDEOS USING SMOTE, N-GRAM, AND LDA APPROACHES

LILIANDARI, ANNISA RIZKI (2021) TOPIC DISCOVERY AND CLASSIFICATION COMPARISON ON THE COMMENTS OF INDONESIAN ENTERTAINMENT YOUTUBE CHANNEL VIDEOS USING SMOTE, N-GRAM, AND LDA APPROACHES. S1 thesis, Universitas Mercu Buana Jakarta.

Files

Cover page (HAL COVER): 01 Cover - ANNISA RIZKI LILIANDARI.pdf (1MB)
Chapter I (BAB I): 02 Bab 1 - ANNISA RIZKI LILIANDARI.pdf (867kB), restricted to registered users only
Chapter II (BAB II): 03 Bab 2 - ANNISA RIZKI LILIANDARI.pdf (274kB), restricted to registered users only
Chapter III (BAB III): 04 Bab 3 - ANNISA RIZKI LILIANDARI.pdf (331kB), restricted to registered users only
Chapter IV (BAB IV): 05 Bab 4 - ANNISA RIZKI LILIANDARI.pdf (172kB), restricted to registered users only
Chapter V (BAB V): 06 Bab 5 - ANNISA RIZKI LILIANDARI.pdf (981kB), restricted to registered users only
Chapter VI (BAB VI): 07 Bab 6 - ANNISA RIZKI LILIANDARI.pdf (31kB), restricted to registered users only
References (DAFTAR PUSTAKA): 08 Daftar Pustaka - ANNISA RIZKI LILIANDARI.pdf (143kB), restricted to registered users only
Appendix (LAMPIRAN): 09 Lampiran - ANNISA RIZKI LILIANDARI.pdf (259kB), restricted to registered users only

Abstract

YouTube is currently the most popular social media platform, with 88% of active users having access to it. Comments containing opinions and suggestions keep growing in number and have become difficult to interpret one by one. This research focuses on text classification and topic modeling of YouTube comments on Indonesian entertainment videos. Classification is carried out with data mining methods, comparing the performance of Multinomial Naïve Bayes (MNB), K-Nearest Neighbor (K-NN), and Support Vector Machine (SVM) and examining the effect of various experimental settings, in order to find the most accurate model for classifying comments as positive, negative, or neutral. Topic modeling uses Latent Dirichlet Allocation (LDA). The results indicate that complete preprocessing, the SMOTE technique, parameter setting, and N-gram features all contribute to improved accuracy. The best accuracy was obtained from models that applied SMOTE with a split of 80% training data and 20% testing data. The SVM + SMOTE model outperformed the MNB + SMOTE and K-NN + SMOTE models, with an accuracy of 97.2% (dataset 1), 96.1% (dataset 2), and 96.3% (dataset 3). The topic modeling shows that two of the three datasets share the same topic in their content presentation.

Key words: YouTube commentary, text classification, topic modeling, machine learning, SMOTE, n-gram.
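
To illustrate two of the techniques named in the abstract, the following is a minimal sketch of word N-gram features and SMOTE-style oversampling, written with the Python standard library only. It is not the thesis's implementation: the function names (word_ngrams, vectorize, smote), the toy English comments, their labels, and the values of k and n_new are all assumptions made for illustration. The classifiers (MNB, K-NN, SVM) and LDA are not covered here.

```python
import math
import random
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return the word n-grams of a comment, from n_min-grams to n_max-grams."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def vectorize(comments, n_min=1, n_max=2):
    """Build an n-gram count matrix and the sorted vocabulary it is indexed by."""
    counts = [Counter(word_ngrams(c, n_min, n_max)) for c in comments]
    vocab = sorted(set().union(*counts))
    return [[float(c[g]) for g in vocab] for c in counts], vocab


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: four positive and two negative comments.
comments = [
    "great video", "love this video", "so funny", "great content",
    "bad audio", "bad ending",
]
labels = ["positive"] * 4 + ["negative"] * 2

X, vocab = vectorize(comments)
minority = [x for x, y in zip(X, labels) if y == "negative"]
new_points = smote(minority, n_new=2, k=1)

# After oversampling, both classes have four samples each.
balanced_X = X + new_points
balanced_y = labels + ["negative"] * len(new_points)
```

In a setting like the 80%/20% split described in the abstract, oversampling of this kind is normally applied to the training portion only, so that synthetic points do not leak into the test data. Whether the thesis followed that order is not stated in this record.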

Item Type: Thesis (S1)
NIM/NIDN Creators: 41517010006
Uncontrolled Keywords: YouTube commentary, text classification, topic modeling, machine learning, SMOTE, n-gram.
Subjects: 000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 000. Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 005 Computer Programming, Programs, Data/Pemprograman Komputer, Program, Data > 005.5 General Purpose Application Programs/Program Aplikasi dengan Kegunaan Khusus
600 Technology/Teknologi > 650 Management, Public Relations, Business and Auxiliary Service/Manajemen, Hubungan Masyarakat, Bisnis dan Ilmu yang Berkaitan > 658 General Management/Manajemen Umum > 658.01-658.09 [Management of Enterprises of Specific Sizes, Scopes, Forms; Data Processing]/[Pengelolaan Usaha dengan Ukuran, Lingkup, Bentuk Tertentu; Pengolahan Data] > 658.05 Data Processing Computer Applications/Pengolahan Data Aplikasi Komputer
Divisions: Fakultas Ilmu Komputer > Informatika
Depositing User: Dede Muksin Lubis
Date Deposited: 08 Oct 2024 03:29
Last Modified: 08 Oct 2024 03:29
URI: http://repository.mercubuana.ac.id/id/eprint/91639
