TWEETS ANALYSIS OF HOAX DETECTION USING LEVENSHTEIN DISTANCE AND TEXT CLASSIFICATION METHODS

PUSPITAJUDIN, YEPPY MANGUN (2022) TWEETS ANALYSIS OF HOAX DETECTION USING LEVENSHTEIN DISTANCE AND TEXT CLASSIFICATION METHODS. S1 thesis, Universitas Mercu Buana Jakarta.

Abstract

Hoaxes mislead people by presenting false information as truth, which can cause anxiety or discord in society, especially around trends or events such as presidential elections, pandemics, and political issues. The Ministry of Communication and Informatics (Kominfo) of Indonesia recorded 1,028 hoaxes spread across various social media platforms up to August 8, 2020, and according to DailySocial.id research, 44% of Indonesians are not yet able to detect hoax news. This research addresses a Machine Learning topic: detecting the spread of false information (hoaxes) in tweets by combining the Levenshtein Distance (LD) method with Text Classification methods, namely Support Vector Machine (SVM) and Stochastic Gradient Descent (SGD). TF-IDF is used to weight the words of the tweets that form the vocabulary, LD measures the number of differences between the texts being processed, and SVM and SGD classify the text, so that the final result is a percentage indicating whether the text is considered a hoax or not. The research focuses on detecting hoaxes on three topics: the COVID-19 pandemic, politics, and economics in Indonesia.

Keywords: Hoax, Tweet Analysis, Hoax Detection, Levenshtein Distance, Text Classification.
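
As an illustration of the Levenshtein Distance step named in the abstract, the following is a minimal Python sketch, not the thesis's own implementation. The function names `levenshtein` and `similarity_percent` are hypothetical, and converting the distance into a percentage by dividing by the longer string's length is an assumption; the abstract does not say how its percentage is derived.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Assumed normalization: 100% for identical strings, 0% when every
    position of the longer string has to be edited."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(similarity_percent("hoax", "hoaks"))
```

In a pipeline like the one the abstract outlines, such a score could be computed between an incoming tweet and known hoax texts, alongside the TF-IDF features passed to the SVM and SGD classifiers; how the thesis actually combines these steps is not stated in this record.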

Item Type: Thesis (S1)
Call Number CD: FIK/INFO. 22 181
Call Number: SIK/15/23/005
NIM/NIDN Creators: 41518010056
Uncontrolled Keywords: Hoax, Tweet Analysis, Hoax Detection, Levenshtein Distance, Text Classification.
Subjects: 000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 000. Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 001 Knowledge/Ilmu Pengetahuan > 001.9 Controversial Knowledge/Pengetahuan Kontroversial > 001.95 Deceptions and Hoaxes/Penipuan dan Hoax
000 Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 000. Computer Science, Information and General Works/Ilmu Komputer, Informasi, dan Karya Umum > 004 Data Processing, Computer Science/Pemrosesan Data, Ilmu Komputer, Teknik Informatika
100 Philosophy and Psychology/Filsafat dan Psikologi > 150 Psychology/Psikologi > 154 Subconscious and Altered States and Process/Psikologi Bawah Sadar > 154.6 Sleep Phenomena/Fenomena Tidur > 154.63 Dreams/Mimpi > 154.634 Analysis/Analisis
Divisions: Fakultas Ilmu Komputer > Informatika
Depositing User: ADELINA HASNA SETIAWATI
Date Deposited: 18 Jan 2023 02:10
Last Modified: 18 Jan 2023 02:10
URI: http://repository.mercubuana.ac.id/id/eprint/73501
