Towards Explainable Graph Spectral Clustering for BERT Embeddings
Authors
Abstract
Artificial Intelligence algorithms are increasingly applied to tasks in Natural Language Processing, including document clustering. As these algorithms grow more complex (e.g., transformer-based embeddings such as BERT) and/or are of a ``black-box'' nature, as is the case for Graph Spectral Clustering (GSC) algorithms, the demand for explaining their results becomes ever more pressing.
In this paper, we propose a model-aware method to explain the results of GSC in the context of BERT-based embeddings.
We present a novel theoretical methodology for explanation, based on the premise that document similarity in GSC is computed as the cosine similarity of the documents' BERT embeddings.
We demonstrate the validity of this methodology by presenting strong GSC clustering results that recover the human-made assignment of hashtags to tweets. We show that GSC based on BERT embeddings outperforms approaches using the Term Vector Space and GloVe embeddings. The resulting explanations are therefore also expected to be of higher quality.
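The pipeline sketched in the abstract — cosine similarity of document embeddings used as the affinity matrix for spectral clustering — can be illustrated as follows. This is a minimal sketch, not the authors' implementation: random vectors stand in for BERT document embeddings, and scikit-learn's `SpectralClustering` with a precomputed affinity plays the role of GSC.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in for BERT document embeddings: two well-separated groups of
# 5 "documents" each, in an 8-dimensional embedding space.
rng = np.random.default_rng(0)
emb = np.vstack([
    rng.normal(loc=1.0, size=(5, 8)),
    rng.normal(loc=-1.0, size=(5, 8)),
])

# Affinity matrix = pairwise cosine similarity, shifted from [-1, 1]
# to [0, 1] so that all affinities are non-negative.
S = (cosine_similarity(emb) + 1.0) / 2.0

# Spectral clustering on the precomputed cosine-similarity affinity.
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(S)
```

With well-separated groups, the recovered labels split the first five documents from the last five, mirroring how GSC over cosine similarities of BERT embeddings is expected to group topically related tweets.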



