Please use this identifier to cite or link to this item: https://repository.uksw.edu//handle/123456789/23948
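Title: Perbandingan Metode k-Nearest Neighbor dan Naive Bayes untuk Klasifikasi Data Single Nucleotide Polymorphism (Comparison of the k-Nearest Neighbor and Naive Bayes Methods for Classifying Single Nucleotide Polymorphism Data)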
Authors: Indrajaya, Denny
Keywords: classification; k-Nearest Neighbor; Naive Bayes; Single Nucleotide Polymorphism
Issue Date: Apr-2022
Abstract: This research compares the accuracy and F1 score of the k-Nearest Neighbor method and the Naive Bayes method in classifying Single Nucleotide Polymorphism (SNP) data from 120 people divided into two groups: European (CEU) and Yoruba (YRI). The better method is determined from the mean accuracy and mean F1 over 1000 iterations and several percentage splits between training and testing datasets. The SNP locations used for classification were also selected by correlation analysis. Across splits of 60% training and 40% testing, 70% training and 30% testing, 80% training and 20% testing, and 90% training and 10% testing, the k-Nearest Neighbor method with k = 31 achieved a mean accuracy of 98.38% and a mean F1 of 98.39%, while the Naive Bayes method achieved a mean accuracy of 96.74% and a mean F1 of 96.63%. In this case, the k-Nearest Neighbor classifier outperforms the Naive Bayes classifier at classifying SNP data to determine whether a person's ancestry tends toward CEU or YRI.
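The evaluation scheme described in the abstract (repeated random training/testing splits, with accuracy and F1 averaged over the iterations) can be sketched as follows. This is only an illustrative sketch, not the thesis's code: the toy genotype generator, the allele frequencies, the absolute-difference distance, and every function name here are assumptions, and the Naive Bayes side of the comparison is omitted for brevity.

```python
import random
from collections import Counter

def make_toy_data(n_per_class, n_snps, rng):
    # Hypothetical toy data, NOT the thesis dataset: genotypes are coded 0/1/2
    # (copies of one allele), with assumed allele frequencies 0.2 and 0.8.
    X, y = [], []
    for label, freq in (("CEU", 0.2), ("YRI", 0.8)):
        for _ in range(n_per_class):
            X.append([(rng.random() < freq) + (rng.random() < freq)
                      for _ in range(n_snps)])
            y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, k):
    # Rank training samples by summed absolute genotype difference (an assumed
    # distance), then take a majority vote among the k nearest.
    ranked = sorted((sum(abs(a - b) for a, b in zip(row, x)), label)
                    for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def f1_score(y_true, y_pred, positive):
    # F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def repeated_holdout(X, y, train_frac, iterations, k, rng):
    # Average accuracy and F1 over repeated random train/test splits.
    n_train = round(len(X) * train_frac)
    accs, f1s = [], []
    for _ in range(iterations):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        y_true = [y[i] for i in test]
        y_pred = [knn_predict(train_X, train_y, X[i], k) for i in test]
        accs.append(sum(t == p for t, p in zip(y_true, y_pred)) / len(test))
        f1s.append(f1_score(y_true, y_pred, positive="CEU"))
    return sum(accs) / len(accs), sum(f1s) / len(f1s)

if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_toy_data(n_per_class=60, n_snps=10, rng=rng)
    for frac in (0.6, 0.7, 0.8, 0.9):
        acc, f1 = repeated_holdout(X, y, frac, iterations=20, k=31, rng=rng)
        print(f"train={frac:.0%}  mean accuracy={acc:.4f}  mean F1={f1:.4f}")
```

The demo uses 20 iterations rather than the 1000 reported in the abstract so that it runs quickly. Its numbers come from synthetic data and are not comparable to the reported results.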
URI: https://repository.uksw.edu/handle/123456789/23948
Appears in Collections: T1 - Mathematics

Files in This Item:
File                                     Size       Format
T1_662018003_Judul.pdf                   943.02 kB  Adobe PDF
T1_662018003_Daftar Pustaka.pdf          331.03 kB  Adobe PDF
T1_662018003_Isi.pdf (until 2999-01-01)  849.49 kB  Adobe PDF

