Desarrollo e implementación de sistemas de recomendación

dc.contributor.advisor: Masmela Caita, Luis Alejandro
dc.contributor.author: Aldana Contreras, Laura Camila
dc.date.accessioned: 2025-04-02T15:13:56Z
dc.date.available: 2025-04-02T15:13:56Z
dc.date.created: 2025-01-31
dc.description: Este proyecto se desarrolló en colaboración con Compensar, una de las principales entidades de bienestar y caja de compensación en Colombia, con el objetivo de optimizar la selección de productos para eventos mediante un sistema de recomendación basado en contenido. Para ello, se implementaron técnicas avanzadas de procesamiento de lenguaje natural (PLN), utilizando el modelo TF-IDF (Term Frequency-Inverse Document Frequency) para extraer y analizar atributos clave de los productos, tales como su categoría, nombre y el evento sugerido. A partir del análisis de palabras clave y similitudes textuales, el sistema identifica los productos más relevantes para cada evento. Adicionalmente, se incorporó un modelo de Naive Bayes Multinomial para la categorización de productos, lo que permitió una organización más eficiente y precisa de los mismos. Este modelo, entrenado con datos previamente etiquetados, mejora la alineación entre los productos y los distintos tipos de eventos, facilitando la toma de decisiones en la planificación y logística. El objetivo principal del sistema es proporcionar recomendaciones personalizadas y precisas, mejorando la experiencia del usuario y optimizando la gestión de eventos en Compensar. La validación del sistema se llevó a cabo mediante métricas clave, como la exactitud en la clasificación y la retroalimentación de los usuarios, lo que garantiza su eficacia y escalabilidad en distintos escenarios de datos.
dc.description.abstract: This project was developed in collaboration with Compensar, one of the leading welfare and compensation fund entities in Colombia, with the aim of optimizing product selection for events through a content-based recommendation system. To achieve this, advanced natural language processing (NLP) techniques were implemented, utilizing the TF-IDF (Term Frequency-Inverse Document Frequency) model to extract and analyze key product attributes, such as category, name, and suggested event. Based on keyword analysis and textual similarities, the system identifies the most relevant products for each event. Additionally, a Multinomial Naive Bayes model was incorporated for product categorization, enabling a more efficient and precise organization of items. This model, trained with previously labeled data, improves the alignment between products and different event types, facilitating decision-making in planning and logistics. The primary objective of the system is to provide personalized and accurate recommendations, enhancing the user experience and optimizing event management at Compensar. The system was validated using key metrics, such as classification accuracy and user feedback, ensuring its effectiveness and scalability across different data scenarios.
dc.format.mimetype: pdf
dc.identifier.uri: http://hdl.handle.net/11349/94501
dc.language.iso: spa
dc.publisher: Universidad Distrital Francisco José de Caldas
dc.relation.references: @book{manning2008introduction, author = {Manning, Christopher and Raghavan, Prabhakar and Schütze, Hinrich}, title = {Introduction to Information Retrieval}, year = {2008}, publisher = {Cambridge University Press} }
dc.relation.references: @book{geron2019hands, author = {Géron, Aurélien}, title = {Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow}, year = {2019}, publisher = {O'Reilly Media} }
dc.relation.references: @incollection{nystrom2001naive, author = {Nyström, Thomas and Salakoski, Tapio}, title = {Naive Bayes Classifier}, booktitle = {Handbook of Computational Statistics}, pages = {169--174}, publisher = {Springer}, year = {2001} }
dc.relation.references: @misc{python, author = {{Python Software Foundation}}, title = {Python 3 Documentation}, year = {2023}, url = {https://docs.python.org/3/}, note = {Accessed: 2023-12-31} }
dc.relation.references: @misc{pandas2023, author = {{The Pandas Development Team}}, title = {pandas: powerful Python data analysis toolkit}, year = {2023}, url = {https://pandas.pydata.org/} }
dc.relation.references: @article{scikit-learn, author = {Pedregosa, Fabian and Varoquaux, Gaël and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and Vanderplas, Jake and Passos, Alexandre and Cournapeau, David and Brucher, Matthieu and Perrot, Matthieu and Duchesnay, Edouard}, title = {Scikit-learn: Machine Learning in Python}, journal = {Journal of Machine Learning Research}, volume = {12}, pages = {2825--2830}, year = {2011}, url = {https://scikit-learn.org/} }
dc.relation.references: @book{nltk, author = {Bird, Steven and Klein, Ewan and Loper, Edward}, title = {Natural Language Processing with Python}, publisher = {O'Reilly Media, Inc.}, year = {2009}, url = {https://www.nltk.org/} }
dc.relation.references: @article{imblearn, author = {Lemaître, Guillaume and Nogueira, Fernando and Aridas, Christos K.}, title = {Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning}, journal = {Journal of Machine Learning Research}, volume = {18}, number = {17}, pages = {1--5}, year = {2017}, url = {https://imbalanced-learn.org/} }
dc.relation.references: @article{matplotlib, author = {Hunter, John D.}, title = {Matplotlib: A 2D Graphics Environment}, journal = {Computing in Science \& Engineering}, volume = {9}, number = {3}, pages = {90--95}, year = {2007}, doi = {10.1109/MCSE.2007.55} }
dc.relation.references: @article{numpy, author = {Harris, Charles R. and Millman, K. Jarrod and van der Walt, Stéfan J. and Gommers, Ralf and Virtanen, Pauli and Cournapeau, David and Wieser, Eric and Taylor, Julian and Berg, Sebastian and Smith, Nathaniel J. and Kern, Robert and Picus, Matti and Hoyer, Stephan and van Kerkwijk, Marten H. and Brett, Matthew and Haldane, Allan and Fernández del Río, Jaime and Wiebe, Mark and Peterson, Pearu and Gérard-Marchant, Pierre and Sheppard, Kevin and Reddy, Tyler and Weckesser, Warren and Abbasi, Hameer and Gohlke, Christoph and Oliphant, Travis E.}, title = {Array programming with NumPy}, journal = {Nature}, volume = {585}, pages = {357--362}, year = {2020}, doi = {10.1038/s41586-020-2649-2} }
dc.rights.acceso: Abierto (Texto Completo)
dc.rights.accessrights: OpenAccess
dc.subject: Sistemas de recomendación
dc.subject: TF-IDF
dc.subject: Procesamiento de lenguaje natural
dc.subject: Naive bayes multinomial
dc.subject: Categorización de productos
dc.subject: Personalización
dc.subject: Eventos
dc.subject.keyword: Recommendation systems
dc.subject.keyword: TF-IDF
dc.subject.keyword: Natural language processing
dc.subject.keyword: Multinomial naive bayes
dc.subject.keyword: Product categorization
dc.subject.keyword: Personalization
dc.subject.keyword: Events
dc.subject.lemb: Matemáticas -- Tesis y Disertaciones Académicas
dc.subject.lemb: Administración de la producción -- Modelos matemáticos
dc.subject.lemb: Administración Industrial -- Modelos matemáticos
dc.subject.lemb: Control de inventarios -- Modelos matemáticos
dc.title: Desarrollo e implementación de sistemas de recomendación
dc.title.titleenglish: Development and implementation of recommendation systems
dc.type: bachelorThesis
dc.type.coar: http://purl.org/coar/resource_type/c_7a1f
dc.type.degree: Pasantía
dc.type.driver: info:eu-repo/semantics/bachelorThesis
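
The approach summarized in the abstract (TF-IDF similarity over product attributes, plus a Multinomial Naive Bayes categorizer) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation: the toy catalogue, the function names (`tfidf_vectors`, `recommend`, `train_nb`), the smoothed IDF formula, and the Laplace smoothing parameter `alpha` are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy catalogue: (name, category, suggested event). Not thesis data.
PRODUCTS = [
    ("chocolate cake", "food", "birthday"),
    ("balloon arch", "decoration", "birthday"),
    ("projector rental", "equipment", "conference"),
    ("coffee station", "food", "conference"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return ({term: weight} per document, idf table), using tf * smoothed idf."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recommend(query, top_k=2):
    """Rank products by TF-IDF cosine similarity to a free-text query."""
    docs = [tokenize(" ".join(p)) for p in PRODUCTS]
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the catalogue are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [PRODUCTS[i][0] for i in ranked[:top_k]]


def train_nb(labeled, alpha=1.0):
    """Train Multinomial Naive Bayes on (text, label) pairs; return a classifier."""
    class_counts = Counter(label for _, label in labeled)
    term_counts = defaultdict(Counter)
    for text, label in labeled:
        term_counts[label].update(tokenize(text))
    vocab = {t for counts in term_counts.values() for t in counts}

    def classify(text):
        scores = {}
        for label, n_docs in class_counts.items():
            total = sum(term_counts[label].values()) + alpha * len(vocab)
            score = math.log(n_docs / len(labeled))  # log prior
            for t in tokenize(text):
                if t in vocab:  # out-of-vocabulary terms are skipped
                    score += math.log((term_counts[label][t] + alpha) / total)
            scores[label] = score
        return max(scores, key=scores.get)

    return classify


if __name__ == "__main__":
    print(recommend("birthday food"))
    classify = train_nb([(name, category) for name, category, _ in PRODUCTS])
    print(classify("vanilla cake"))
```

In this sketch each product is represented by the concatenation of its name, category, and suggested event, mirroring the attributes the abstract lists; a real system would likely need proper tokenization, stop-word handling, and a held-out set for measuring classification accuracy.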

Files

Original bundle
AldanaContrerasLauraCamila2025.pdf (860.31 KB, Adobe Portable Document Format)
Licencia de uso y publicacion .docx (271.46 KB, Microsoft Word XML)

License bundle
license.txt (7 KB, Item-specific license agreed upon to submission)