Detection model for DDoS (distributed denial of service) attacks based on the decision tree classifier.
Abstract
Intrusion Detection Systems (IDS) play an important role in securing modern networked computer systems and in limiting the damage an attack can cause. An IDS observes events on a network to support subsequent decision making. As the Internet and local networks have grown in popularity, intrusion incidents have increased, and the rapid expansion of computer networks creates a need for systems that can reliably detect network threats with a high detection rate and a low false positive rate under normal computational resource usage. To meet this need, this degree project, carried out in the monograph modality, proposed an intrusion detection model for DDoS attacks implemented with the Decision Tree classification technique from Machine Learning. The model was built on the CIC-IDS 2017 dataset, developed by the University of New Brunswick, Canada, which contains up-to-date records of common attacks in a realistic environment; its most relevant features were selected to optimize the system's learning. The resulting system is a Python script organized into the following stages: Data Collection, Data Pre-Processing, Exploratory Data Analysis, Data Set Reduction, Algorithm Selection, System Training, and System Evaluation. The trained system can process a new dataset with the same structure as the training dataset and classify each traffic capture as either a normal connection or an attack.
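The training and evaluation stages listed above can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and uses synthetic data from `make_classification` as a stand-in for CIC-IDS 2017; the feature count, split ratio, and variable names are illustrative assumptions, not the project's actual script.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labeled flow records (assumption: 0 = normal, 1 = attack).
# This is NOT CIC-IDS 2017 data.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=42
)

# Hold out part of the data for evaluation, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# System Training: fit a decision tree with default hyper-parameters.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# System Evaluation: accuracy, detection rate, and false positive rate.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
detection_rate = tp / (tp + fn)        # share of attacks that were flagged
false_positive_rate = fp / (fp + tn)   # share of normal traffic flagged as attack

print(f"accuracy={accuracy:.3f}")
print(f"detection_rate={detection_rate:.3f}")
print(f"false_positive_rate={false_positive_rate:.3f}")
```

A new capture with the same column structure would be classified by passing its feature matrix to `clf.predict`.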
After analyzing the metrics of the system trained on the full dataset and comparing them with those of other Machine Learning algorithms, it can be concluded that the Decision Tree algorithm did not achieve the best performance among the techniques compared. The best performance was obtained by the K-Nearest Neighbors algorithm, whose classification accuracy was 53% higher than that of Decision Tree. However, a subsequent training run that varied the Decision Tree hyper-parameters achieved a 49% increase in classification accuracy, putting it on a par with the other algorithms. For the system trained on the data set reduced with the WEKA software under the correlation criterion, Decision Tree improved its accuracy by 36% compared with training on the complete data set, placing it among the best-performing algorithms together with Random Forest.
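The two improvements described above, correlation-based data set reduction and hyper-parameter variation, can be sketched as follows. The project performed the reduction in WEKA; this sketch swaps in a simple ranking by absolute Pearson correlation with the label, computed with NumPy, and uses scikit-learn's `GridSearchCV` for the hyper-parameter search. The grid values, `top_k`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the traffic dataset (NOT CIC-IDS 2017 data).
X, y = make_classification(
    n_samples=1500, n_features=20, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Data set reduction: rank features by |Pearson correlation| with the label,
# using the training split only, and keep the top_k highest-ranked columns.
corr = np.array([
    abs(np.corrcoef(X_train[:, j], y_train)[0, 1])
    for j in range(X_train.shape[1])
])
top_k = 10  # hypothetical cut-off
selected = np.argsort(corr)[::-1][:top_k]

# Hyper-parameter variation: exhaustive search over a small, assumed grid
# with 5-fold cross-validation on the reduced training set.
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 20],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=5, scoring="accuracy"
)
search.fit(X_train[:, selected], y_train)

# Accuracy of the best tree (refit on the full training split) on held-out data.
test_accuracy = search.score(X_test[:, selected], y_test)
print("selected feature indices:", sorted(selected.tolist()))
print("best hyper-parameters:", search.best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
```

Computing the correlation ranking on the training split alone keeps the held-out data from influencing feature selection, so the reported accuracy stays an honest estimate.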