Comparative Analysis of the Performance of the K-Nearest Neighbor (K-NN) and Decision Tree Algorithms in Phishing Website Classification: A Study

Fajar Dwi Prasetyo; Muhammad Maulana; Faris Ramadhan; Ananda Lutfi Setiabudi; Imam Budiawan; Desmulyati Desmulyati

doi:10.31004/jerkin.v4i3.4965

Authors

Fajar Dwi Prasetyo, Universitas Bina Sarana Informatika
Muhammad Maulana, Universitas Bina Sarana Informatika
Faris Ramadhan, Universitas Bina Sarana Informatika
Ananda Lutfi Setiabudi, Universitas Bina Sarana Informatika
Imam Budiawan, Universitas Bina Sarana Informatika
Desmulyati Desmulyati, Universitas Bina Sarana Informatika

Keywords:

Machine Learning, Phishing Detection, K-Nearest Neighbor, Decision Tree, Cybersecurity, URL Classification

Abstract

Phishing attacks are a significant cybersecurity threat that aims to steal sensitive user information through psychological manipulation using fake websites. Conventional detection methods that rely on blacklists are considered ineffective at recognizing zero-day attacks, that is, newly published phishing sites. This study develops an automated detection model using a machine learning approach and compares the performance of two supervised learning algorithms: K-Nearest Neighbor (K-NN) and Decision Tree. The dataset is sourced from the UCI Machine Learning Repository and consists of 11,055 records with 30 URL characteristic features. Performance was evaluated using accuracy and confusion matrix analysis. Experimental results indicate that Decision Tree substantially outperforms K-NN, reaching an accuracy of 95.21% against only 60.11% for K-NN. Decision Tree also showed a very low false negative rate, which makes it the recommended model of the two for real-time cybersecurity system implementation.
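To make the evaluation criteria concrete, the following is a minimal sketch of how accuracy and the false negative rate can be derived from a binary confusion matrix. It is not the study's code: the function names, the label encoding (1 = phishing, 0 = legitimate), and the toy labels are all assumptions made for illustration, and the numbers are unrelated to the reported results.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the phishing class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            # A phishing site: either caught (tp) or missed (fn).
            counts["tp" if pred == positive else "fn"] += 1
        else:
            # A legitimate site: either wrongly flagged (fp) or passed (tn).
            counts["fp" if pred == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def false_negative_rate(tp, fn):
    """Share of actual phishing sites that the model missed."""
    return fn / (tp + fn) if (tp + fn) else 0.0


# Toy labels invented for illustration only (assumed: 1 = phishing, 0 = legitimate).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy={accuracy(tp, fp, tn, fn):.2f}")
print(f"false negative rate={false_negative_rate(tp, fn):.2f}")
```

In a phishing detector, a false negative is a phishing site that is classified as legitimate, so two models with similar accuracy can still differ sharply in how many attacks they let through. That is why the false negative rate is worth reading alongside accuracy.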
