THEORETICAL AND EXPERIMENTAL COMPARISON OF THE K-MEANS AND K-MEDOIDS ALGORITHMS IN DATA CLUSTERING

Arya Pratama Putra
Jihan Tshivana
Elkin Rilvani

Abstract

This study presents a comparative analysis of two popular clustering algorithms, K-Means and K-Medoids, focusing on clustering quality, computational efficiency, and robustness to outliers. Using the Wine dataset, which records chemical properties of wines, we evaluated both algorithms with several metrics: Silhouette Score, Calinski-Harabasz Score, Davies-Bouldin Score, and processing time. The results indicate that K-Means is more computationally efficient, with shorter execution times, and that it also achieves higher Silhouette and Calinski-Harabasz scores than K-Medoids. However, K-Medoids is more robust to outliers and noise and produces more stable clustering results. The study therefore suggests that K-Means is better suited to large, relatively clean datasets, while K-Medoids is recommended for datasets with substantial noise or many outliers. These findings can guide the choice of clustering algorithm based on data characteristics and application requirements.
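The robustness claim can be illustrated with a minimal sketch of the center-update step alone. It is not the authors' experiment: it assumes Euclidean distance and a hypothetical toy cluster rather than the Wine data, and the helper names `centroid` and `medoid` are introduced here only for illustration. It compares how far a K-Means-style mean and a K-Medoids-style medoid move when a single extreme point is added.

```python
import math

def centroid(points):
    """K-Means-style center: coordinate-wise mean (need not be a data point)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def medoid(points):
    """K-Medoids-style center: the data point with the smallest
    total Euclidean distance to all points in the cluster."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical toy cluster (an assumption, not data from the study).
cluster = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0), (2.0, 2.0)]
outlier = (50.0, 50.0)
noisy_cluster = cluster + [outlier]

clean_mean, noisy_mean = centroid(cluster), centroid(noisy_cluster)
clean_medoid, noisy_medoid = medoid(cluster), medoid(noisy_cluster)

# How far each kind of center moved after the outlier was added.
mean_shift = math.dist(clean_mean, noisy_mean)
medoid_shift = math.dist(clean_medoid, noisy_medoid)

print(f"mean:   {clean_mean} -> {noisy_mean} (shift {mean_shift:.2f})")
print(f"medoid: {clean_medoid} -> {noisy_medoid} (shift {medoid_shift:.2f})")
```

In this toy case the mean is pulled toward the outlier while the medoid stays on the same data point. The sketch also hints at the efficiency gap: the mean is a single pass over the points, whereas this naive medoid search compares every pair of points in the cluster.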


How to Cite

PERBANDINGAN TEORITIS DAN EKSPERIMEN ALGORITMA K-MEANS DAN K-MEDOIDS DALAM KLASTERISASI DATA. (2025). Kohesi: Jurnal Sains Dan Teknologi, 10(2), 61-70. https://doi.org/10.2238/0b2z5035
