Training Model for a Rule-Based Expert System via Reinforcement Learning Applied to Trajectory Generation
Abstract
In mobile robotics, the task of generating trajectories to a destination point has been approached from various angles with different algorithms, including heuristic search, graphs, neural networks, and swarm algorithms. Some of these require generating a set of nodes that can be organized in multiple ways across the environment through which the robot will move. Another type of approach focuses on the nature of the environment, which may be fully or partially known, and static or dynamic; this determines whether the route can be planned in advance or must be plotted and adjusted as the robot makes its way to the destination point.
Artificial intelligence is a relatively young field, born in the 1950s. One of its most interesting branches today is reinforcement learning, a type of learning in which the artificial intelligence system interacts with an environment and generates knowledge from it automatically and gradually.
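The interaction loop described above can be illustrated with a minimal tabular Q-learning sketch (a standard reinforcement-learning algorithm, not the thesis's own system): an agent in a tiny corridor environment accumulates knowledge step by step purely from the rewards it receives.

```python
import random

# Illustrative sketch of reinforcement learning (not the thesis's model):
# tabular Q-learning on a 1-D corridor of 4 cells, with the goal at cell 3.
# Knowledge (the Q-table) is built gradually from interaction alone.
def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    size = 4
    q = [[0.0, 0.0] for _ in range(size)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        state = 0
        while state != size - 1:
            if rng.random() < epsilon:
                action = rng.randrange(2)               # explore
            else:
                action = q[state].index(max(q[state]))  # exploit
            step = 1 if action == 1 else -1
            next_state = max(0, min(size - 1, state + step))
            reward = 1.0 if next_state == size - 1 else -0.1
            # incremental update: the agent refines its estimate each step
            q[state][action] += alpha * (
                reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

q = train_q_learning()
# After training, "move right" dominates in every non-goal cell,
# i.e. the agent has learned the path to the goal from experience.
assert all(q[s][1] > q[s][0] for s in range(3))
```

The environment, rewards, and hyperparameters here are hypothetical and chosen only to make the interaction-driven learning loop concrete.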
There are currently a large number of artificial intelligence systems that work by means of complex neural networks, in which it is not easy to determine whether the model behind their decisions presents some kind of bias when analyzing the data. An example is the classification of a person's credit risk based on variables such as race or gender, which is hard to detect given the poor interpretability of such a model. In contrast, other methodologies offer high model explainability, such as decision trees or rule-based expert systems.
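To make the explainability contrast concrete, here is a minimal sketch of a rule-based classifier (all rule names, variables, and thresholds are hypothetical): the decision model is a plain, human-readable list of rules, so any rule relying on a sensitive variable would be visible and could be audited or removed, unlike the weights of a neural network.

```python
# Hypothetical rule base for credit-risk classification: each rule is a
# (description, condition, conclusion) triple that a human can read and audit.
rules = [
    ("high debt ratio", lambda r: r["debt_ratio"] > 0.6, "high risk"),
    ("missed payments", lambda r: r["missed_payments"] >= 2, "high risk"),
    ("stable income", lambda r: r["income_years"] >= 3, "low risk"),
]

def classify(record, default="medium risk"):
    """Fire the first matching rule; the rule itself explains the decision."""
    for description, condition, conclusion in rules:
        if condition(record):
            return conclusion, description  # conclusion plus justification
    return default, "no rule matched"

label, reason = classify(
    {"debt_ratio": 0.7, "missed_payments": 0, "income_years": 5})
# → ("high risk", "high debt ratio"): the decision comes with its reason
```

Because the model is just this list, inspecting it for bias amounts to reading the conditions, which is exactly the interpretability advantage the paragraph describes.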
In stratospheric navigation, guiding a vehicle during the descent phase is challenging, given its fall speed as well as the random behavior of the environmental variables it faces.
Therefore, a training system for a rule-based expert system was developed, in which the expert system learns by reinforcement. This required generating a training context that implements the task the expert system must solve; this environment poses the challenges the expert system learns from. The proposed approach presents an advantage over other models: a human can intervene in the resulting set of rules, and therefore in the model that controls the decisions the expert system makes. However, such auditing and modification of the rules may in turn change the system's ability to solve the task.
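A minimal sketch of the general idea (the environment, rule encoding, and update scheme below are assumptions for illustration, not the thesis's actual implementation): a small grid environment poses navigation episodes, and reinforcement adjusts the strength of condition-action rules. The learned rule set remains readable, so a human could audit or edit individual rules, with the caveat that edits may change task performance.

```python
import random

# Hedged sketch: condition-action rules whose strengths are learned by
# reinforcement in a grid environment; off-grid moves are penalised.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def condition(pos, goal):
    """Discretise the state into the sign of the offset to the goal."""
    (x, y), (gx, gy) = pos, goal
    return ((gx > x) - (gx < x), (gy > y) - (gy < y))

def train(episodes=300, size=5, alpha=0.3, epsilon=0.2, seed=1):
    rng = random.Random(seed)
    strengths = {}                      # (condition, action) -> learned strength
    goal = (size - 1, size - 1)
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(4 * size):       # bounded episode length
            cond = condition(pos, goal)
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))      # explore
            else:                                        # exploit best rule
                action = max(ACTIONS,
                             key=lambda a: strengths.get((cond, a), 0.0))
            dx, dy = ACTIONS[action]
            nxt = (pos[0] + dx, pos[1] + dy)
            off_grid = not (0 <= nxt[0] < size and 0 <= nxt[1] < size)
            reward = -1.0 if off_grid else (1.0 if nxt == goal else -0.05)
            key = (cond, action)
            strengths[key] = (strengths.get(key, 0.0)
                              + alpha * (reward - strengths.get(key, 0.0)))
            if off_grid:
                continue                # stay in place after a penalised move
            pos = nxt
            if pos == goal:
                break
    return strengths

rules = train()
# Each entry reads as an auditable rule, e.g. "if the goal is to the right
# and below, prefer moving right or down" — and a human could edit it.
```

The key point the sketch tries to show is that the learned artifact is a table of explicit rules rather than opaque weights, which is what makes human intervention possible.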
As a result, a reinforcement training system was obtained that produces a set of rules. These rules contain the knowledge the expert system needs to establish the trajectory from the starting point to the destination point, staying within the environment and avoiding obstacles in 76.2% of the validation episodes executed.