A Study of Clustering Approaches Applied to Customer Reviews in the Digital Era

Authors

  • M.N.S. Tissera, Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka
  • P.P.G.D. Asanka, Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka
  • R.A.C.P. Rajapakse, Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka

DOI:

https://doi.org/10.31357/ait.v4i02.8023

Keywords:

Clustering, Customer Segmentation, Language Model, Marketing, Text Analysis

Abstract

The digital revolution has reshaped the landscape of business transactions, with online platforms generating vast amounts of text data through customer reviews. This paper explores the transformative potential of harnessing this data for customer segmentation, comparing traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) and Bag-of-Words (BoW) with state-of-the-art Large Language Models (LLMs) for sentence embeddings. The primary objective is to identify the most effective approach for customer segmentation based on textual data by conducting a comprehensive analysis using clustering approaches. The study investigates the impact of LLMs, specifically BERT, RoBERTa, XLNet, and MPNet, in contrast to TF-IDF and BoW. Through experimentation and evaluation metrics, including the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index, the research sheds light on the nuanced effectiveness of each method. While LLMs, particularly RoBERTa, demonstrate superior clustering performance, the study acknowledges the subtle impact of spelling correction on these models. The findings provide valuable insights for businesses seeking to understand customer sentiments and preferences, enabling more targeted and personalized strategies in the dynamic digital age. This research contributes to the evolving field of customer analytics by offering a comparative analysis of clustering approaches, laying the foundation for future advancements in text-based customer segmentation.
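The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses k-means as the clustering algorithm, which the abstract does not specify. The toy reviews, the helper names `evaluate` and `compare`, and the choice of two clusters are all hypothetical and are not taken from the paper. Only the BoW and TF-IDF representations are shown; a sentence-embedding matrix from a model such as RoBERTa or MPNet could be passed to `evaluate` in the same way.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Toy reviews invented for illustration only (not the paper's dataset).
REVIEWS = [
    "fast delivery and great packaging",
    "delivery was fast, packaging was great",
    "quick delivery, well packaged",
    "battery life is poor and drains quickly",
    "poor battery, it drains within hours",
    "the battery drains fast, poor life",
]


def evaluate(features, n_clusters=2, seed=0):
    """Cluster a dense feature matrix with k-means and score the partition.

    k-means is an assumption here; any clustering algorithm that returns
    one label per row could be substituted.
    """
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(features)
    return {
        # Silhouette Score: in [-1, 1], higher means better-separated clusters.
        "silhouette": silhouette_score(features, labels),
        # Davies-Bouldin Index: non-negative, lower is better.
        "davies_bouldin": davies_bouldin_score(features, labels),
        # Calinski-Harabasz Index: higher is better.
        "calinski_harabasz": calinski_harabasz_score(features, labels),
    }


def compare(reviews, n_clusters=2):
    """Score BoW and TF-IDF representations of the same reviews."""
    vectorizers = {"bow": CountVectorizer(), "tfidf": TfidfVectorizer()}
    results = {}
    for name, vectorizer in vectorizers.items():
        # Convert to a dense array, since not every metric accepts sparse input.
        features = vectorizer.fit_transform(reviews).toarray()
        results[name] = evaluate(features, n_clusters)
    return results


if __name__ == "__main__":
    for name, scores in compare(REVIEWS).items():
        print(name, {metric: round(float(value), 3) for metric, value in scores.items()})
```

The three metrics are computed on each representation's own feature space, so differences between methods should be read as indicative rather than strictly comparable across vector spaces of different dimensionality.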

 

Published

2025-03-05

How to Cite

M.N.S. Tissera, P.P.G.D. Asanka, & R.A.C.P. Rajapakse. (2025). A Study of Clustering Approaches Applied to Customer Reviews in the Digital Era. Advances in Technology, 4(02). https://doi.org/10.31357/ait.v4i02.8023

Section

Articles
