

Vol. 8 No. 6 (2025): Kohesi: Jurnal Sains dan Teknologi

SENTIMENT ANALYSIS OF CORETAX USER REVIEWS ON TWITTER USING THE NAIVE BAYES AND SUPPORT VECTOR MACHINE ALGORITHMS

DOI: https://doi.org/10.2238/6s04zv53
Submitted: June 17, 2025
Published: June 17, 2025

Abstract

This study evaluates user sentiment toward the latest tax application, Coretax, and compares the effectiveness of two classification algorithms, Support Vector Machine (SVM) and Naive Bayes. The data were collected by web scraping from the X platform (formerly Twitter) and then passed through a series of preprocessing steps: data cleaning, case folding, normalization, stopword removal, stemming, labeling, and visualization. Features were then weighted using the Term Frequency-Inverse Document Frequency (TF-IDF) method and selected using SelectKBest with the chi-square test.
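
As a rough illustration of the preprocessing and weighting steps named above, the following Python sketch cleans, case-folds, normalizes, and removes stopwords from two invented tweets, then computes TF-IDF weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the `SLANG` and `STOPWORDS` tables, the sample tweets, and the smoothed IDF variant are all hypothetical, and stemming, labeling, and chi-square feature selection are left out because they depend on external resources the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical lookup tables for illustration only; real Indonesian
# normalization and stopword lists are far larger.
SLANG = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "yg": "yang"}
STOPWORDS = {"yang", "di", "dan", "itu", "ini", "ke"}


def preprocess(text):
    """Clean, case-fold, normalize, and remove stopwords; return tokens."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)  # cleaning: URLs, mentions, hashtags
    text = text.lower()                                # case folding
    text = re.sub(r"[^a-z\s]", " ", text)              # cleaning: digits, punctuation, emoji
    tokens = [SLANG.get(t, t) for t in text.split()]   # normalization of informal spellings
    return [t for t in tokens if t not in STOPWORDS]   # stopword removal


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant; the abstract does not name one).
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * idf[t] for t in tf})
    return weights


# Invented example tweets, not taken from the study's data set.
tweets = [
    "Coretax error lagi, gk bisa login! https://t.co/xyz",
    "@DitjenPajakRI Coretax lancar bgt hari ini #pajak",
]
tokens = [preprocess(t) for t in tweets]
vectors = tfidf(tokens)
```

Terms that occur in every document, such as "coretax" here, receive the lowest weight, while terms specific to one tweet are weighted higher; this is the property that makes TF-IDF useful before feature selection.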

The evaluation showed that both algorithms classified sentiment well. SVM reached 92% accuracy, with precision, recall, and F1-score of 93%, 94%, and 94% for negative sentiment and 90%, 88%, and 89% for positive sentiment. Naive Bayes also reached 92% accuracy, with 90% precision, 99% recall, and 94% F1-score for negative sentiment and 97% precision, 80% recall, and 88% F1-score for positive sentiment. Both algorithms are therefore reliable at recognizing negative sentiment, but positive sentiment is detected less well: recall for the positive class is 88% for SVM and 80% for Naive Bayes.
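
The reported F1-scores can be sanity-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short Python sketch below recomputes F1 for each model and class from the rounded percentages in the abstract. The function and variable names are illustrative, and because the inputs are already rounded, the recomputed values can differ from the reported ones by up to about one percentage point.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract.
reported = {
    ("SVM", "negative"): (0.93, 0.94, 0.94),
    ("SVM", "positive"): (0.90, 0.88, 0.89),
    ("Naive Bayes", "negative"): (0.90, 0.99, 0.94),
    ("Naive Bayes", "positive"): (0.97, 0.80, 0.88),
}

# Recompute F1 per model and class and compare with the reported value.
recomputed = {key: f1_score(p, r) for key, (p, r, _) in reported.items()}
for (model, label), (p, r, f) in reported.items():
    value = recomputed[(model, label)]
    print(f"{model:12s} {label:9s} reported={f:.2f} recomputed={value:.3f}")
```

For Naive Bayes on the positive class, the gap between 97% precision and 80% recall pulls F1 down to about 88%, which is consistent with the abstract's remark that positive sentiment is the harder class.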

Keywords: Coretax, Sentiment Analysis, Support Vector Machine, Naive Bayes 

Abstract (Indonesian version, translated)

This study aims to identify user perceptions, or sentiment, toward the latest tax application, Coretax, and to compare the performance of two classification methods, Support Vector Machine (SVM) and Naive Bayes. The data were obtained by web scraping from the X platform (formerly known as Twitter) and then processed through a series of preprocessing stages: text cleaning, case folding, normalization, stopword removal, stemming, data labeling, and visualization. After these steps, terms were weighted using the Term Frequency-Inverse Document Frequency (TF-IDF) method, and features were selected using SelectKBest with the chi-square test.

In testing, both algorithms classified the sentiment data very well. SVM achieved an accuracy of 92%, with precision, recall, and F1-score of 93%, 94%, and 94% for negative sentiment and 90%, 88%, and 89% for positive sentiment. Naive Bayes also recorded an accuracy of 92%, with 90% precision, 99% recall, and 94% F1-score for negative sentiment, and 97% precision, 80% recall, and 88% F1-score for positive sentiment. These findings show that both methods are effective at identifying negative sentiment, while the classification of positive sentiment still needs improvement.

Keywords: Sentiment Analysis, Coretax, Support Vector Machine, Naive Bayes
