Towards Explainable Graph Spectral Clustering for BERT Embeddings

Authors

Keywords: Explainable Machine Learning, Natural Language Processing, Graph Spectral Clustering, Document Embedding versus Explainability, BERT and GloVe and TVS Embedding

Abstract

Artificial Intelligence algorithms are increasingly applied to Natural Language Processing tasks, including document clustering. As these algorithms grow more complex (for example, transformer-based embeddings such as BERT) and/or are of a "black-box" nature (for example, Graph Spectral Clustering (GSC) algorithms), the need to explain their results is becoming urgent.
In this paper, we propose a model-aware method to explain the results of GSC in the context of BERT-based embeddings.
We present a novel theoretical methodology for explanation, based on the premise that document similarity in GSC is computed as cosine similarity of BERT embeddings of documents. 
We demonstrate the validity of this methodology by presenting strong GSC clustering results that recover the human-made assignment of hashtags to tweets. We show that GSC based on BERT embeddings outperforms approaches using Term Vector Space and GloVe embeddings. Therefore, the resulting explanations are also expected to be of higher quality.
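
To make the premise concrete, the following is a minimal sketch of spectral clustering over a cosine-similarity graph. It is not the authors' implementation. The assumptions are ours: the toy vectors stand in for BERT document embeddings, negative similarities are clipped to zero, the symmetric normalized Laplacian is used, and the helper names (cosine_similarity_matrix, spectral_embedding, kmeans) are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    return Xn @ Xn.T

def spectral_embedding(S, k):
    """Row-normalized eigenvectors for the k smallest eigenvalues of the normalized Laplacian."""
    S = np.clip(S, 0.0, None)   # assumption: negative similarity means "no edge"
    np.fill_diagonal(S, 0.0)    # drop self-loops
    d = S.sum(axis=1)           # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(d, 1e-12, None))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, :k]
    return U / np.clip(np.linalg.norm(U, axis=1, keepdims=True), 1e-12, None)

def kmeans(U, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy stand-ins for document embeddings: two loosely separated groups.
X = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.9, 0.2, 0.1, 0.0],
    [1.0, 0.0, 0.1, 0.1],
    [0.0, 0.1, 1.0, 0.2],
    [0.1, 0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0, 0.0],
])
S = cosine_similarity_matrix(X)
labels = kmeans(spectral_embedding(S, k=2), k=2)
print(labels)
```

In practice the rows of X would come from a BERT encoder, and the choice of Laplacian and of how to handle negative similarities may differ from what the paper uses.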

Published
25.03.2026
Section
Articles

How to Cite

Kłopotek, M., Wierzchoń, S. T., Starosta, B., Borkowski, P., & Czerski, D. (2026). Towards Explainable Graph Spectral Clustering for BERT Embeddings. Journal of Automation, Mobile Robotics and Intelligent Systems, 20(1), 53-65. https://doi.org/10.14313/jamris-2026-005