Prediction of Presidential Election Results using Sentiment Analysis with Pre and Post Candidate Registration Data

Authors

  • Asno Azzawagama Firdaus, Universitas Ahmad Dahlan, Indonesia
  • Anton Yudhana, Universitas Ahmad Dahlan, Indonesia
  • Imam Riadi, Universitas Ahmad Dahlan, Indonesia

DOI:

https://doi.org/10.23917/khif.v10i1.4836

Keywords:

Indonesia, Naïve Bayes, President, Sentiment Analysis, Support Vector Machine, Twitter

Abstract

Social media offers politicians a campaign tool that costs less than conventional campaigning. The 2024 Indonesian presidential election has attracted public attention, especially among social media users, and Twitter, one of the most widely used social media platforms in Indonesia, has become an effective campaign platform. Sentiment analysis is one approach for measuring public opinion on Indonesian presidential candidates from Twitter data. The data were collected in two rounds: in March 2023, before the declaration of candidates, and in November 2023, shortly after the registration of presidential and vice-presidential candidates. The March 2023 collection yielded 15,000 tweets and the November 2023 collection 11,569 tweets, all labeled manually by linguists. After duplicate tweets were removed, the March 2023 data comprised 10,569 tweets (3,523 per candidate) and the November 2023 data comprised 4,893 tweets (1,631 per candidate pair). The sentiment classification models were built with the Naïve Bayes and Support Vector Machine (SVM) methods using Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. In the March 2023 data, the highest share of positive sentiment was for Ganjar Pranowo (77.94%) and the highest share of negative sentiment was for Anies Baswedan (31.39%). In the November 2023 data, the highest positive sentiment was for the candidate pair Ganjar Pranowo - Mahfud MD (69.16%), and the highest negative sentiment was for Prabowo Subianto - Gibran Rakabuming Raka (52.12%). Words that frequently appeared in positive-sentiment tweets about Ganjar Pranowo - Mahfud MD included "strong", "corruption", "support", and "appreciation". The highest accuracy achieved was 86% for the SVM method and 79% for the Naïve Bayes method.
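
The classification step described in the abstract (TF-IDF features fed to a Naïve Bayes classifier) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the toy English tweets, the whitespace tokenizer, the smoothed IDF formula, the Laplace smoothing value, and all function names are assumptions made for illustration, and the SVM classifier is omitted because it would normally come from a machine-learning library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one document; terms unseen in training are dropped."""
    tf = Counter(tokenize(doc))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    weight = defaultdict(Counter)  # label -> term -> summed TF-IDF weight
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weight[label][term] += w
    class_count = Counter(labels)
    model = {}
    for label, count in class_count.items():
        total = sum(weight[label].values()) + alpha * len(idf)
        log_prior = math.log(count / len(labels))
        log_like = {term: math.log((weight[label][term] + alpha) / total)
                    for term in idf}
        model[label] = (log_prior, log_like)
    return model


def predict(doc, idf, model):
    """Return the label with the highest posterior log-score."""
    feats = tfidf(doc, idf)

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in feats.items())

    return max(model, key=score)


# Hypothetical, manually labeled toy tweets (not taken from the study's data).
train_docs = [
    "strong support for the candidate",
    "great appreciation and strong support",
    "disappointed by the weak debate",
    "weak policy and bad debate",
]
train_labels = ["positive", "positive", "negative", "negative"]

idf = fit_idf(train_docs)
model = train_nb(train_docs, train_labels, idf)

print(predict("strong appreciation", idf, model))
print(predict("weak debate", idf, model))
```

In practice, a study of this kind would more likely rely on an established library for both the TF-IDF vectorizer and the classifiers, and would report accuracy on a held-out test split rather than on hand-picked sentences.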

Submitted

2024-04-30

Published

2024-04-30

Section

Articles