Identification of Keywords for Legal Documents Categories using SOM

Keywords: Document classification, RoBERTa, NLP, GloVe, NER, SOM

Abstract

This study aims to support the decision-making process in categorizing legal documents by identifying the keywords that characterize each legal domain class. It combines the Kohonen Self-Organizing Map (SOM) method with the Global Vectors for Word Representation (GloVe) model to create an efficient document classification system, which achieved a classification accuracy of 71.69%. The article also discusses alternative approaches implemented to improve classification accuracy, namely Named Entity Recognition (NER) tools and the RoBERTa model, and compares their effectiveness. Finally, it addresses the challenges posed by the uneven distribution of categories in the dataset and outlines potential directions for further research to improve the classification of legal documents.
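To make the SOM step concrete, the following is a minimal sketch of how document vectors can be mapped onto a Kohonen grid. It is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the 3x3 grid, the decay schedules, and the toy 4-dimensional vectors (hypothetical stand-ins for averaged GloVe document embeddings) are all assumptions made for illustration.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Return the (row, col) of the unit whose weight vector is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

def train_som(data, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM with linearly decaying learning rate and radius."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(rows, cols, data.shape[1]))
    # grid[i, j] holds the coordinates (i, j) of each unit on the map
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # floor is an assumption
            bmu = best_matching_unit(weights, x)
            # Gaussian neighbourhood around the winning unit on the grid
            grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

# Toy data: two well-separated "categories" of document vectors (hypothetical).
data_rng = np.random.default_rng(1)
civil = np.array([1.0, 0.0, 0.0, 0.0]) + 0.05 * data_rng.normal(size=(10, 4))
criminal = np.array([0.0, 0.0, 0.0, 1.0]) + 0.05 * data_rng.normal(size=(10, 4))
docs = np.vstack([civil, criminal])
labels = ["civil"] * 10 + ["criminal"] * 10

weights = train_som(docs)

# Collect which category labels land on which map unit.
units = {}
for vec, label in zip(docs, labels):
    units.setdefault(best_matching_unit(weights, vec), []).append(label)
for unit, members in sorted(units.items()):
    print(unit, sorted(set(members)))
```

In a setting like the one the abstract describes, the units that attract documents of a single category could then be inspected for the words closest to their weight vectors, which is one plausible route to category keywords; whether the article proceeds this way is not stated in the abstract.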

Published
26.03.2025
Section
Articles

How to Cite

Puchalska, P., Krzemiński, K., Lis, M., Scherer, R., Drozda, P., Komar-Komarowski, K., Szałapak, K., Sobecki, A., Zymkowski, T., & Szymański, J. (2025). Identification of Keywords for Legal Documents Categories using SOM. Journal of Automation, Mobile Robotics and Intelligent Systems, 19(1), 33-41. https://doi.org/10.14313/jamris-2025-004