Hyperparameter Optimization of TF-IDF and SVM via Grid Search for Sentiment Analysis of Traveloka Customer Reviews

Authors

  • Muhammad Bayu Kurniawan, Universitas AMIKOM Yogyakarta, Indonesia
  • Hanafi, Universitas AMIKOM Yogyakarta, Indonesia
  • Riki Hikmianto, Universitas AMIKOM Yogyakarta, Indonesia
  • Isnawati Muslihah, Institut Seni Indonesia Surakarta, Indonesia

Keywords

Sentiment Analysis, Hyperparameter Optimization, Grid Search, Support Vector Machine, TF-IDF, Traveloka

Abstract

Customer reviews on digital platforms inform service improvements and business decisions. This study addresses automated sentiment analysis for Traveloka, a leading Indonesian online travel application. We propose a systematic hyperparameter optimization of a combined TF-IDF and Support Vector Machine (SVM) pipeline. A dataset of 20,200 user reviews was collected from the Google Play Store. After preprocessing and a two-stage labeling process, the data were split using stratified sampling (70% training, 30% testing). We ran a Grid Search with stratified 5-fold cross-validation to jointly optimize TF-IDF n-gram ranges (unigram, bigram, trigram) and SVM hyperparameters across four kernel types (Linear, RBF, Polynomial, Sigmoid). The Polynomial kernel with trigram features (C=5, gamma=1, degree=5, coef0=10) performed best, achieving a test accuracy of 87.10% and a macro F1-score of 86.9%. The minimal performance differences among the top configurations suggest that performance on this task is robust to specific parameter choices. Error analysis showed that the model is highly reliable at detecting negative feedback (precision: 90.4%), but that its bag-of-n-grams representation struggles with contrastive sentences and informal language. Future work could address these limitations with contextual embeddings (e.g., IndoBERT) and alternative algorithms such as Random Forest or neural networks. This research presents a thoroughly optimized traditional machine learning methodology that establishes a strong baseline for automated sentiment analysis of Indonesian user feedback.
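
The joint search described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn as the toolkit (the abstract does not name the library used). The toy reviews, the candidate values in the grid, the cumulative n-gram ranges, the macro-F1 selection metric, and the random seed are all illustrative assumptions; only the stratified 70/30 split, the stratified 5-fold cross-validation, the four kernel types, and the tuned parameter names come from the abstract.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy reviews standing in for the 20,200 Traveloka reviews.
pos_words = ["good", "great", "fast", "easy", "helpful"]
neg_words = ["bad", "slow", "broken", "late", "rude"]
topics = ("app", "service")
texts = [f"{w} booking {t}" for w in pos_words + neg_words for t in topics]
labels = [1] * 10 + [0] * 10  # 1 = positive, 0 = negative

# Stratified 70/30 split, as stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=42
)

# TF-IDF features and the SVM are tuned together in one pipeline.
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svm", SVC())])

# Assumption: "unigram, bigram, trigram" is read as cumulative ranges.
ngrams = [(1, 1), (1, 2), (1, 3)]

# One sub-grid per kernel so that only relevant parameters are searched.
# The candidate values are illustrative, not the study's actual grid.
param_grid = [
    {"tfidf__ngram_range": ngrams, "svm__kernel": ["linear"],
     "svm__C": [1, 5]},
    {"tfidf__ngram_range": ngrams, "svm__kernel": ["rbf"],
     "svm__C": [1, 5], "svm__gamma": [0.1, 1]},
    {"tfidf__ngram_range": ngrams, "svm__kernel": ["poly"],
     "svm__C": [1, 5], "svm__gamma": [1],
     "svm__degree": [2, 5], "svm__coef0": [0, 10]},
    {"tfidf__ngram_range": ngrams, "svm__kernel": ["sigmoid"],
     "svm__C": [1, 5], "svm__gamma": [0.1, 1], "svm__coef0": [0, 10]},
]

# Stratified 5-fold cross-validation on the training portion only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid, scoring="f1_macro", cv=cv)
search.fit(X_train, y_train)

# The best pipeline is refit on all training data, then scored once on the test set.
test_macro_f1 = search.score(X_test, y_test)
print(search.best_params_)
print(f"test macro F1: {test_macro_f1:.3f}")
```

Placing the vectorizer inside the pipeline means the TF-IDF vocabulary and weights are refit within each training fold, so no information from a validation fold leaks into feature extraction.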

Submitted

2024-04-25

Accepted

2026-01-26

Published

2026-06-02

How to Cite

Muhammad Bayu Kurniawan, Hanafi, Riki Hikmianto, & Isnawati Muslihah. (2026). Hyperparameter Optimization of TF-IDF and SVM via Grid Search for Sentiment Analysis of Traveloka Customer Reviews. Khazanah Informatika : Jurnal Ilmu Komputer Dan Informatika, 11(2), 10–18. Retrieved from https://journals2.ums.ac.id/khif/article/view/4784