Implementación de redes tipo transformer en la selección estratégica de perfiles laborales a nivel empresarial

dc.contributor.advisorFerro Escobar, Roberto
dc.contributor.authorTovar Sánchez, Juan Sebastián
dc.contributor.authorCastro Castellanos, Cristian Camilo
dc.contributor.orcidFerro Escobar, Roberto [0000-0002-8978-538X]
dc.date.accessioned2025-03-16T20:39:30Z
dc.date.available2025-03-16T20:39:30Z
dc.date.created2024-08-13
dc.descriptionThis project implements a RAG (Retrieval-Augmented Generation) model aimed at the context of recruitment and personnel selection (limited to areas related to Electronic Engineering). The starting point is a document database made up of PDF files, which passes through a preprocessing phase based on text cleaning and tokenization and is then converted into a vectorized database. The data is prepared for the model through chunking and indexing operations, which in turn enables the inclusion of an LLM (Large Language Model) based on a transformer architecture; together with vector search mechanisms and similarity learning, this provides language generation and information retrieval, respectively. Integrating each of these parts yields the RAG system; on that basis, the aim is to find the best parameters for the given conditions, evaluating the performance obtained in each case in search of the best result.
dc.description.abstractIn this project, a RAG (Retrieval-Augmented Generation) model is developed for application in the context of recruitment and personnel selection (limited to areas related to Electronic Engineering). The starting point is the creation of a document database (composed of PDF files), followed by a preprocessing phase based on text cleaning and tokenization, which is then converted into a vectorized database. The data is prepared for model training through chunking and indexing operations, enabling the inclusion of a Large Language Model (LLM) based on a transformer model. This model, along with vector search mechanisms and similarity learning, allows for language generation and information retrieval, respectively. By integrating each of these components, the RAG model is constructed. The aim is to find the best parameters according to the given conditions, evaluating the performance obtained in each case to achieve the best result.
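
The abstract above outlines a concrete pipeline: extract text from the PDF corpus, clean and chunk it, embed the chunks into a vectorized database, retrieve by similarity, and hand the retrieved context to a transformer LLM. The following is a minimal sketch of that retrieval flow, assuming a sentence-transformers MiniLM embedding model (the model family cited in the references below); the chunk parameters, sample text, and query are illustrative placeholders, not values taken from the thesis.

# Minimal RAG retrieval sketch (assumptions: sentence-transformers is
# installed; documents, chunk sizes, and the query are hypothetical).
from sentence_transformers import SentenceTransformer
import numpy as np

def chunk(text, size=500, overlap=50):
    """Split extracted text into overlapping character windows (chunking)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# 1. Preprocessing: text already extracted and cleaned from the PDF corpus.
documents = ["...cleaned text extracted from one candidate CV..."]
chunks = [c for doc in documents for c in chunk(doc)]

# 2. Indexing: embed every chunk; the matrix acts as the vector database.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")
index = model.encode(chunks, normalize_embeddings=True)

# 3. Retrieval: rank chunks by cosine similarity to the recruiter's query.
query = "experience with embedded systems and PCB design"
q = model.encode([query], normalize_embeddings=True)[0]
top_k = np.argsort(index @ q)[::-1][:3]
context = "\n".join(chunks[int(i)] for i in top_k)

# 4. Generation: the retrieved context conditions the LLM's prompt.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

In the thesis itself, indexing and retrieval are handled by frameworks such as LangChain and LlamaIndex (listed among the subjects below), which replace the hand-rolled chunking and NumPy similarity search of this sketch with managed vector stores.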
dc.format.mimetypepdf
dc.identifier.urihttp://hdl.handle.net/11349/93707
dc.language.isospa
dc.publisherUniversidad Distrital Francisco José de Caldas
dc.relation.references[Nvidia, 2024] ¿Qué Es un Modelo Transformer? | Blog de NVIDIA. (n.d.). Retrieved April 20, 2024.
dc.relation.references[LangChain,2024] ChatGPT Over Your Data. (n.d.). Retrieved April 21, 2024, from https://blog.langchain.dev/tutorial-chatgpt-over-your-data/
dc.relation.references[Lewis, 2020] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-T., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Retrieved April 21, 2024, from https://github.com/huggingface/transformers/blob/master/
dc.relation.references[Nvidia, 2024] What Is Retrieval-Augmented Generation aka RAG | NVIDIA Blogs. (n.d.). Retrieved April 21, 2024, from https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
dc.relation.references[Guu, 2020] Guu, K., Lee, K., Tung, Z., Pasupat, P., & Chang, M.-W. (2020). REALM: Retrieval-Augmented Language Model Pre-Training.
dc.relation.references[Sun, 2022] Sun, Z., Wang, X., Tay, Y., Yang, Y., & Zhou, D. (2022). Recitation-Augmented Language Models. https://doi.org/10.48550/arxiv.2210.01296
dc.relation.references[Dixit, 2022] Dixit, T., Paranjape, B., Hajishirzi, H., & Zettlemoyer, L. (2022). CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation. https://doi.org/10.48550/arxiv.2210.04873
dc.relation.references[Izacard, 2022] Izacard, G., Lewis, P., Lomelí, M., Hosseini, L., Petroni, F., Schick, T., … & Grave, É. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models. https://doi.org/10.48550/arxiv.2208.03299
dc.relation.references[Glass, 2021] Glass, M., Rossiello, G., Chowdhury, M. F. M., & Gliozzo, A. (2021). Robust Retrieval Augmented Generation for Zero-shot Slot Filling. https://doi.org/10.48550/arxiv.2108.13934
dc.relation.references[Yang, 2023] Yang, Z., Ping, W., Liu, Z., Korthikanti, V., Nie, W., Huang, D., … & Anandkumar, A. (2023). Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning. https://doi.org/10.48550/arxiv.2302.04858
dc.relation.references[Kim, 2021] Kim, B., Seo, S., Han, S., Erdenee, E., & Chang, B. (2021). Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation. https://doi.org/10.18653/v1/2021.findings-emnlp.286
dc.relation.references[Trang, 2020] N. T. M. Trang and M. Shcherbakov, "Vietnamese Question Answering System from Multilingual BERT Models to Monolingual BERT Model," 2020 9th International Conference System Modeling and Advancement in Research Trends (SMART), Moradabad, India, 2020, pp. 201-206, doi: 10.1109/SMART50582.2020.9337155.
dc.relation.references[Ghani, 2024] Ghani and I. K. Raharjana, "Chatbots in Academia: A Retrieval-Augmented Generation Approach for Improved Efficient Information Access," 2024 16th International Conference on Knowledge and Smart Technology (KST), Krabi, Thailand, 2024, pp. 259-264, doi: 10.1109/KST61284.2024.10499652.
dc.relation.references[IBM, 2024] ¿Qué es LangChain? | IBM. (n.d.). Retrieved May 7, 2024, from https://www.ibm.com/mx-es/topics/langchain
dc.relation.references[Stork ai,2024] Descripción general del marco de LlamaIndex | Stork. (n.d.). Retrieved May 7, 2024, from https://www.stork.ai/es/blog/an-overview-of-the-llamaindex-framework
dc.relation.references[Xataka, 2023] LLaMA 3: qué es y qué novedades tiene la nueva versión de la IA que se integrará en Facebook, Instagram y WhatsApp con Meta AI. (n.d.). Retrieved May 7, 2024, from https://www.xataka.com/basics/llama-3-que-que-novedades-tiene-nueva-version-ia-que-se-integrara-facebook-instagram-whatsapp-meta-ai
dc.relation.references[Victor M, 2024] Mixtral: El Modelo de Lenguaje de Código Abierto que Transforma la IA - Víctor Mollá. (n.d.). Retrieved May 7, 2024, from https://www.victormolla.com/mixtral-el-modelo-de-lenguaje-de-c%C3%B3digo-abierto-que-transforma-la-ia
dc.relation.references[Microsoft,2024] microsoft/MiniLM-L12-H384-uncased · Hugging Face. (n.d.). Retrieved May 13, 2024, from https://huggingface.co/microsoft/MiniLM-L12-H384-uncased
dc.relation.references[Microsoft,2023] unilm/minilm at master · microsoft/unilm · GitHub. (n.d.). Retrieved May 13, 2024, from https://github.com/microsoft/unilm/tree/master/minilm
dc.relation.references[eweek,2024] 6 Best Large Language Models (LLMs) in 2024. (n.d.). Retrieved May 21, 2024, from https://www.eweek.com/artificial-intelligence/best-large-language-models/
dc.relation.references[Rothman, 2022] Denis Rothman; Antonio Gulli, Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, Hugging Face, and OpenAI's GPT-3, ChatGPT, and GPT-4, Packt Publishing, 2022
dc.relation.references[X. Zheng, 2021] X. Zheng, C. Zhang and P. C. Woodland, "Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition," 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Cartagena, Colombia, 2021, pp. 162-168, doi: 10.1109/ASRU51503.2021.9688232.
dc.relation.references[Y. Liu, 2023] Y. Liu, H. Huang, J. Gao and S. Gai, "A study of Chinese Text Classification based on a new type of BERT pre-training," 2023 5th International Conference on Natural Language Processing (ICNLP), Guangzhou, China, 2023, pp. 303-307, doi: 10.1109/ICNLP58431.2023.00062.
dc.relation.references[S. Jhajaria, 2023] S. Jhajaria and D. Kaur, "Study and Comparative Analysis of ChatGPT, GPT and DAll-E2," 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 2023, pp. 1-5, doi: 10.1109/ICCCNT56998.2023.10307823.
dc.relation.references[I. Goodfellow, 2016] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016, pp. 103-110
dc.rights.accesoAbierto (Texto Completo)
dc.rights.accessrightsOpenAccess
dc.subjectLangChain
dc.subjectRAG
dc.subjectLlamaIndex
dc.subjectNLP
dc.subjectInteligencia artificial
dc.subject.keywordLangChain
dc.subject.keywordRAG
dc.subject.keywordLlamaIndex
dc.subject.keywordNLP
dc.subject.keywordArtificial intelligence
dc.subject.lembIngeniería Electrónica -- Tesis y disertaciones académicas
dc.subject.lembInteligencia computacional
dc.subject.lembProcesamiento de lenguaje natural
dc.subject.lembRedes transformer (Aprendizaje profundo)
dc.subject.lembPlanificación de recursos humanos
dc.subject.lembAdministración de personal
dc.titleImplementación de redes tipo transformer en la selección estratégica de perfiles laborales a nivel empresarial
dc.title.titleenglishImplementation of transformer-type networks in the strategic selection of job profiles at the corporate level
dc.typebachelorThesis
dc.type.coarhttp://purl.org/coar/resource_type/c_7a1f
dc.type.degreeMonografía
dc.type.driverinfo:eu-repo/semantics/bachelorThesis

Files

Original bundle (showing 1 - 3 of 3)

Name: TovarSanchezJuanSebastian2024.pdf
Size: 2.63 MB
Format: Adobe Portable Document Format
Description: Trabajo de Grado (undergraduate thesis)

Name: TovarSanchezJuanSebastian2024Anexos.zip
Size: 279.56 KB

Name: Licencia de uso y publicacion.pdf
Size: 216.65 KB
Format: Adobe Portable Document Format

License bundle (showing 1 - 1 of 1)

Name: license.txt
Size: 7 KB
Format: Item-specific license agreed upon to submission