Desarrollo e implementación de sistemas de recomendación

Aldana Contreras, Laura Camila

dc.contributor.advisor	Masmela Caita, Luis Alejandro
dc.contributor.author	Aldana Contreras, Laura Camila
dc.date.accessioned	2025-04-02T15:13:56Z
dc.date.available	2025-04-02T15:13:56Z
dc.date.created	2025-01-31
dc.description	This project was carried out in collaboration with Compensar, one of Colombia's main welfare organizations and family compensation funds, to optimize the selection of products for events by means of a content-based recommendation system. To that end, natural language processing (NLP) techniques were implemented, using the TF-IDF (Term Frequency-Inverse Document Frequency) model to extract and analyze key product attributes such as category, name, and suggested event. From the analysis of keywords and textual similarities, the system identifies the most relevant products for each event. In addition, a Multinomial Naive Bayes model was incorporated to categorize products, which allowed them to be organized more efficiently and precisely. This model, trained on previously labeled data, improves the alignment between products and the different types of events, facilitating decision-making in planning and logistics. The main objective of the system is to provide personalized and accurate recommendations, improving the user experience and optimizing event management at Compensar. The system was validated using key metrics, such as classification accuracy and user feedback, ensuring its effectiveness and scalability across different data scenarios.
dc.description.abstract	This project was developed in collaboration with Compensar, one of the leading welfare and compensation fund entities in Colombia, with the aim of optimizing product selection for events through a content-based recommendation system. To achieve this, advanced natural language processing (NLP) techniques were implemented, utilizing the TF-IDF (Term Frequency-Inverse Document Frequency) model to extract and analyze key product attributes, such as category, name, and suggested event. Based on keyword analysis and textual similarities, the system identifies the most relevant products for each event. Additionally, a Multinomial Naive Bayes model was incorporated for product categorization, enabling a more efficient and precise organization of items. This model, trained with previously labeled data, improves the alignment between products and different event types, facilitating decision-making in planning and logistics. The primary objective of the system is to provide personalized and accurate recommendations, enhancing the user experience and optimizing event management at Compensar. The system was validated using key metrics, such as classification accuracy and user feedback, ensuring its effectiveness and scalability across different data scenarios.
dc.format.mimetype	pdf
dc.identifier.uri	http://hdl.handle.net/11349/94501
dc.language.iso	spa
dc.publisher	Universidad Distrital Francisco José de Caldas
dc.relation.references	@book{manning2008introduction, author = {Manning, Christopher and Raghavan, Prabhakar and Schütze, Hinrich}, title = {Introduction to Information Retrieval}, year = {2008}, publisher = {Cambridge University Press} }
dc.relation.references	@book{geron2019hands, author = {Géron, Aurélien}, title = {Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow}, year = {2019}, publisher = {O'Reilly Media} }
dc.relation.references	@incollection{nystrom2001naive, author = {Nyström, Thomas and Salakoski, Tapio}, title = {Naive Bayes Classifier}, booktitle = {Handbook of Computational Statistics}, pages = {169--174}, publisher = {Springer}, year = {2001} }
dc.relation.references	@misc{python, author = {{Python Software Foundation}}, title = {Python 3 Documentation}, year = {2023}, url = {https://docs.python.org/3/}, note = {Accessed: 2023-12-31} }
dc.relation.references	@misc{pandas2023, author = {{The Pandas Development Team}}, title = {pandas: powerful Python data analysis toolkit}, year = {2023}, url = {https://pandas.pydata.org/} }
dc.relation.references	@article{scikit-learn, author = {Pedregosa, Fabian and Varoquaux, Gaël and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and Vanderplas, Jake and Passos, Alexandre and Cournapeau, David and Brucher, Matthieu and Perrot, Matthieu and Duchesnay, Edouard}, title = {Scikit-learn: Machine Learning in Python}, journal = {Journal of Machine Learning Research}, volume = {12}, pages = {2825--2830}, year = {2011}, url = {https://scikit-learn.org/} }
dc.relation.references	@book{nltk, author = {Bird, Steven and Klein, Ewan and Loper, Edward}, title = {Natural Language Processing with Python}, publisher = {O'Reilly Media, Inc.}, year = {2009}, url = {https://www.nltk.org/} }
dc.relation.references	@article{imblearn, author = {Lemaitre, Guillaume and Nogueira, Fernando and Aridas, Christos K.}, title = {Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning}, journal = {Journal of Machine Learning Research}, volume = {18}, number = {17}, pages = {1--5}, year = {2017}, url = {https://imbalanced-learn.org/} }
dc.relation.references	@article{matplotlib, author = {Hunter, John D.}, title = {Matplotlib: A 2D Graphics Environment}, journal = {Computing in Science \& Engineering}, volume = {9}, number = {3}, pages = {90--95}, year = {2007}, doi = {10.1109/MCSE.2007.55} }
dc.relation.references	@article{numpy, author = {Harris, Charles R. and Millman, K. Jarrod and van der Walt, Stéfan J. and Gommers, Ralf and Virtanen, Pauli and Cournapeau, David and Wieser, Eric and Taylor, Julian and Berg, Sebastian and Smith, Nathaniel J. and Kern, Robert and Picus, Matti and Hoyer, Stephan and van Kerkwijk, Marten H. and Brett, Matthew and Haldane, Allan and del Río, Joaquín F. and Wiebe, Mark and Peterson, Pearu and Gérard-Marchant, Pierre and Sheppard, Kevin and Reddy, Tyler and Weckesser, Warren and Abbasi, Hameer and Gohlke, Christoph and Oliphant, Travis E.}, title = {Array programming with NumPy}, journal = {Nature}, volume = {585}, pages = {357--362}, year = {2020}, doi = {10.1038/s41586-020-2649-2} }
dc.rights.acceso	Open (Full Text)
dc.rights.accessrights	OpenAccess
dc.subject	Recommendation systems
dc.subject	TF-IDF
dc.subject	Natural language processing
dc.subject	Multinomial Naive Bayes
dc.subject	Product categorization
dc.subject	Personalization
dc.subject	Events
dc.subject.keyword	Recommendation systems
dc.subject.keyword	TF-IDF
dc.subject.keyword	Natural language processing
dc.subject.keyword	Multinomial Naive Bayes
dc.subject.keyword	Product categorization
dc.subject.keyword	Personalization
dc.subject.keyword	Events
dc.subject.lemb	Mathematics -- Academic Theses and Dissertations
dc.subject.lemb	Production management -- Mathematical models
dc.subject.lemb	Industrial management -- Mathematical models
dc.subject.lemb	Inventory control -- Mathematical models
dc.title	Desarrollo e implementación de sistemas de recomendación
dc.title.titleenglish	Development and implementation of recommendation systems
dc.type	bachelorThesis
dc.type.coar	http://purl.org/coar/resource_type/c_7a1f
dc.type.degree	Internship
dc.type.driver	info:eu-repo/semantics/bachelorThesis
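
The abstract describes ranking products for an event by TF-IDF similarity over each product's category, name, and suggested event. The following is a minimal sketch of that idea, not the thesis implementation: it uses only the Python standard library, a whitespace tokenizer, smoothed IDF, and cosine similarity, all of which are assumptions. The `PRODUCTS` catalog and the function names (`build_tfidf`, `vectorize`, `cosine`, `recommend`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical sample catalog; the real Compensar data is not part of this record.
PRODUCTS = [
    {"category": "catering", "name": "birthday cake", "event": "birthday party"},
    {"category": "decoration", "name": "balloon arch", "event": "birthday party"},
    {"category": "audiovisual", "name": "projector rental", "event": "corporate meeting"},
    {"category": "catering", "name": "coffee station", "event": "corporate meeting"},
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def vectorize(tokens, idf):
    """Weight raw term counts by IDF; terms absent from the catalog are ignored."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_tfidf(documents):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so every catalog term keeps a positive weight.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    return [vectorize(tokens, idf) for tokens in tokenized], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query, products, top_n=2):
    """Rank products by similarity between the query and their text attributes."""
    texts = [" ".join((p["category"], p["name"], p["event"])) for p in products]
    vectors, idf = build_tfidf(texts)
    query_vec = vectorize(tokenize(query), idf)
    scored = [(cosine(query_vec, vec), p["name"]) for vec, p in zip(vectors, products)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop products that share no vocabulary with the query.
    return [name for score, name in scored[:top_n] if score > 0]


print(recommend("corporate meeting", PRODUCTS))
```

With this sample catalog, a query for "corporate meeting" returns only the two products tagged with that event, and a query whose words do not appear in the catalog returns an empty list. A production system would more likely rely on a library vectorizer and a Spanish-aware tokenizer; this sketch only makes the ranking step concrete.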

Files

Original bundle

Name: AldanaContrerasLauraCamila2025.pdf
Size: 860.31 KB
Format: Adobe Portable Document Format

Name: Licencia de uso y publicacion .docx
Size: 271.46 KB
Format: Microsoft Word XML

License bundle

Name: license.txt
Size: 7 KB
Format: Item-specific license agreed upon to submission

Collections

Mathematics