Support for the retrospective cleaning of data published through SiB Colombia to improve their visibility, consultation and use
Abstract
The National Environmental System (SINA) was established by Law 99 of 1993. As part of that process, the Biodiversity Information System of Colombia (SiB Colombia) was created by Decree 1603 of 1994. SiB Colombia provides open access to data and information on the biological diversity of the national territory, with the support of many organizations from academia, the private sector, NGOs and the SINA institutes themselves, among others. By using international data publication standards such as Darwin Core (DwC), different types of data can be shared through SiB Colombia, including biological records, species lists and sampling events. The DwC standard is well established globally and is maintained by the TDWG community (Biodiversity Information Standards, formerly the Taxonomic Databases Working Group), which continually reviews and updates it.
Publishing primary data through SiB Colombia helps consolidate reliable and timely information that supports national and international decision-making on the management of biological resources, as well as research and education. For the data to be useful, however, their quality must be improved through validation and cleaning processes covering the three dimensions of biodiversity data: taxonomy, geography and time.
With this in mind, this internship carried out a retrospective cleaning of data published in the IPT (Integrated Publishing Toolkit) of SiB Colombia, prioritizing the data sets published by the Alexander von Humboldt Biological Resources Research Institute (IAvH). The aim was to improve data quality with respect to prioritized elements of the DwC standard and to ensure that the metadata were correctly documented, so that the data sets can be properly found, consulted and used. Next, a geographic review was carried out on the data sets published in the IPT by the SINA research institutes (IAvH, IIAP, Invemar, Sinchi and PNN) to verify that the reported coordinates are consistent with the higher geography stated in the data. Based on this review, a quality and geographic report was produced for each institute. Finally, a Python script was developed to generate quality diagnoses that can be replicated in future processes.
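As an illustration of the kind of diagnosis described above, the following is a minimal sketch and not the script developed in the internship. It assumes records are available as dictionaries keyed by Darwin Core term names. The list of prioritized terms, the function names and the approximate bounding box for Colombia are all assumptions made for this example, and the sample records are invented.

```python
from collections import Counter

# Hypothetical selection of prioritized Darwin Core terms; the source does not list them.
REQUIRED_TERMS = ("scientificName", "eventDate", "decimalLatitude",
                  "decimalLongitude", "countryCode")

# Rough, illustrative bounding boxes (min_lat, max_lat, min_lon, max_lon) by countryCode.
# A real geographic review would test coordinates against administrative polygons.
BOUNDING_BOXES = {"CO": (-4.5, 13.6, -82.0, -66.8)}


def parse_coordinate(value, limit):
    """Return value as a float if it is a number within [-limit, limit], else None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if -limit <= number <= limit else None


def diagnose(records, required=REQUIRED_TERMS, boxes=BOUNDING_BOXES):
    """Count quality issues across an iterable of Darwin Core records (dicts)."""
    issues = Counter()
    total = 0
    for record in records:
        total += 1
        # Completeness: a term is missing if absent, None, or blank.
        for term in required:
            value = record.get(term)
            if value is None or not str(value).strip():
                issues[f"missing:{term}"] += 1
        # Validity: coordinates must parse and fall within global ranges.
        # A missing coordinate is therefore also counted as invalid.
        lat = parse_coordinate(record.get("decimalLatitude"), 90)
        lon = parse_coordinate(record.get("decimalLongitude"), 180)
        if lat is None or lon is None:
            issues["invalid_coordinates"] += 1
            continue
        # Coherence: coordinates should fall inside the stated country's box.
        code = str(record.get("countryCode") or "").strip().upper()
        box = boxes.get(code)
        if box and not (box[0] <= lat <= box[1] and box[2] <= lon <= box[3]):
            issues["outside_stated_country"] += 1
    return {"records": total, "issues": dict(issues)}


# Invented sample records; the second has a blank date and swapped latitude/longitude.
sample = [
    {"scientificName": "Tremarctos ornatus", "eventDate": "2015-03-12",
     "decimalLatitude": "4.60", "decimalLongitude": "-74.08", "countryCode": "CO"},
    {"scientificName": "Tremarctos ornatus", "eventDate": "",
     "decimalLatitude": "-74.08", "decimalLongitude": "4.60", "countryCode": "CO"},
]
report = diagnose(sample)
print(report)
```

A bounding box catches only gross errors such as swapped or sign-flipped coordinates. Checking coordinates against department or municipality boundaries would need polygon data and a spatial library, which this sketch leaves out.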