Topic modeling and sentiment analysis using artificial intelligence
Abstract
This document describes the internship project carried out at Inversiones Gutiérrez García, a company that develops tools based on artificial intelligence, such as computer vision and natural language processing (NLP). A large portion of its customers are service stations (EDS). They need NLP applications that tell them, on a recurring and automated basis, how customers perceive their service and how they are positioned as a company relative to competitors. To meet these needs, the project developed models that classify the polarity of comments and identify topics, specifically for texts in Spanish written in the particular "language" used in the context of service stations. Two models were trained for different tasks. The first is a sentiment analysis model that classifies documents according to the polarity of the comment (negative, neutral, or positive). The second is a Named Entity Recognition (NER) model, which automatically detects important entities so that information can be structured. In addition to the trained models, BERTopic and KeyBERT were used for topic extraction over a set of documents and for keyword extraction from a single document, respectively. Finally, six applications are proposed for browsing the resulting data and for using the models easily. Four of these applications give intuitive access to the four models implemented in the project (sentiment analysis, NER, BERTopic, and KeyBERT). The fifth application supports geographic navigation across service stations and their associated comments, which gives a view of how each EDS is positioned against the competition. The sixth application displays comment embeddings in two dimensions, which is key to understanding how documents are classified and separated by topic.
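As an illustration only, the sketch below shows one simple way that per-comment polarity labels could be aggregated per service station to compare positioning. It is not the project's implementation: the station names, the sample labels, and the `polarity_shares` helper are all hypothetical, and the labels stand in for the output of a sentiment model such as the one described above.

```python
from collections import Counter, defaultdict

# The three polarity classes named in the abstract.
POLARITIES = ("negative", "neutral", "positive")


def polarity_shares(labeled_comments):
    """Return, per station, the fraction of comments in each polarity class.

    `labeled_comments` is an iterable of (station, polarity) pairs.
    This helper is hypothetical and only illustrates the aggregation idea.
    """
    counts = defaultdict(Counter)
    for station, polarity in labeled_comments:
        if polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {polarity!r}")
        counts[station][polarity] += 1

    shares = {}
    for station, counter in counts.items():
        total = sum(counter.values())
        shares[station] = {p: counter[p] / total for p in POLARITIES}
    return shares


# Hypothetical labels standing in for sentiment-model predictions.
sample = [
    ("EDS A", "positive"),
    ("EDS A", "positive"),
    ("EDS A", "negative"),
    ("EDS A", "neutral"),
    ("EDS B", "negative"),
    ("EDS B", "positive"),
]

shares = polarity_shares(sample)
for station in sorted(shares):
    print(station, shares[station])
```

Comparing the positive and negative shares across stations gives a rough, assumption-laden proxy for relative positioning; a real comparison would also need to account for the number of comments per station.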
