Classification of Heart Disease Based on Clinical Data Using the K-Nearest Neighbor Method

Abdullah Muhajir; Cendra Harmon

doi:10.31572/inotera.Vol11.Iss1.2026.ID655

Abdullah Muhajir, Universitas Pamulang
Cendra Harmon, Universitas Pamulang

Keywords: Heart Disease, Classification, K-Nearest Neighbor, Machine Learning, Clinical Data

Abstract

Heart disease is one of the leading causes of death worldwide, so methods that support early and accurate diagnosis are urgently needed. This study aims to classify heart disease from patients’ clinical data using the K-Nearest Neighbor (KNN) method. The dataset consists of patients’ clinical data, including age, gender, blood pressure, cholesterol level, maximum heart rate, and other medical attributes. The research stages are data preprocessing, transformation of categorical data into numerical form, normalization with StandardScaler, and splitting of the data into training and testing sets at a ratio of 80% to 20%. Classification is carried out with the K-Nearest Neighbor algorithm using K = 7. Model performance is evaluated with a confusion matrix and the metrics precision, recall, F1-score, and accuracy. The results show that the KNN method classifies heart disease with an accuracy of 57%. The model performs well on the majority class, but its performance on the minority class remains low because of class imbalance and the similarity of characteristics between classes. The KNN method can therefore serve as an initial approach for classifying heart disease from clinical data, although further development is still required to improve model performance.
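
The stages named in the abstract can be illustrated with a minimal sketch in standard-library Python. This is not the authors' code and does not use their dataset: the five features, the synthetic distributions, the class ratio, the random seed, and every function name below are assumptions made for illustration. The z-score step stands in for the StandardScaler normalization mentioned above, and the neighbor count is fixed at K = 7 as in the study.

```python
import math
import random
from collections import Counter

def standardize(train, test):
    """Z-score each feature using the mean and std of the training rows only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Population std; fall back to 1.0 so a constant feature does not divide by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)

def knn_predict(train_x, train_y, query, k=7):
    """Majority vote among the k training rows closest in Euclidean distance."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_metrics(matrix, i):
    """Precision, recall and F1 for the class at row/column i."""
    tp = matrix[i][i]
    fp = sum(row[i] for row in matrix) - tp
    fn = sum(matrix[i]) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Synthetic, imbalanced stand-in data (NOT the study's dataset).
random.seed(0)
SEX_CODE = {"F": 0.0, "M": 1.0}  # categorical value -> numeric code
rows, labels = [], []
for _ in range(200):
    label = 1 if random.random() < 0.3 else 0  # minority class is 1
    rows.append([
        random.gauss(54 + 4 * label, 9),     # age
        SEX_CODE[random.choice("FM")],       # gender, encoded
        random.gauss(130 + 5 * label, 17),   # resting blood pressure
        random.gauss(245 + 10 * label, 50),  # cholesterol
        random.gauss(150 - 12 * label, 22),  # maximum heart rate
    ])
    labels.append(label)

# 80% / 20% split after shuffling the row indices.
order = list(range(len(rows)))
random.shuffle(order)
cut = int(0.8 * len(order))
train_x = [rows[i] for i in order[:cut]]
train_y = [labels[i] for i in order[:cut]]
test_x = [rows[i] for i in order[cut:]]
test_y = [labels[i] for i in order[cut:]]

train_x, test_x = standardize(train_x, test_x)
predictions = [knn_predict(train_x, train_y, q, k=7) for q in test_x]

matrix = confusion_matrix(test_y, predictions, labels=[0, 1])
accuracy = sum(matrix[i][i] for i in range(2)) / len(test_y)
print("confusion matrix:", matrix)
print("accuracy:", round(accuracy, 3))
for cls in (0, 1):
    print("class", cls, "precision/recall/F1:", per_class_metrics(matrix, cls))
```

Fitting the scaling statistics on the training rows only keeps information about the test set out of the model. Reporting precision, recall, and F1 per class alongside accuracy is what exposes weak minority-class performance of the kind the abstract describes.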
