Jeffrey Manuel – South African National Biodiversity Institute
As part of its mandate in terms of the Biodiversity Act, SANBI must maintain databases about biological diversity, as well as coordinate and disseminate biodiversity information. SANBI currently holds over 20 million biodiversity records, of which fewer than half are SANBI’s own records. SANBI is therefore a research institution, but also a research infrastructure and repository for the collation, storage and dissemination of biodiversity information prepared or generated by data providers. This data sharing has typically been protected by data-sharing agreements, but many of these agreements predate the internet age, and data providers may not have anticipated how data is being shared now. Additionally, data-sharing agreements were often bespoke, which does not suit the automated manner in which data sharing now happens. Furthermore, historical data sharing did not account for the legal requirements relating to intellectual property rights, publicly financed research, or the Protection of Personal Information (POPI). These all pose ‘legal’ issues that require contemporary, revised data-sharing agreements. On a more practical note, data sharing has historically happened with a specific use case in mind. This meant that sharing data without provenance (the ‘story’ behind the data) was acceptable, as the user knew what they were getting and what they wanted to use it for. In the era of big data, where hundreds of datasets are ingested for analyses, provenance has become crucial. For more complex datasets or products (like conservation plans) this means technical reports, but in most cases it simply means metadata. A major cultural shift is required for data owners to compile and maintain metadata for their datasets, so that the data can be used as widely as possible.
Lastly, protection of intellectual property, data standards, data quality, licensing, and attribution and citation are also matters that need to be understood and agreed upon when concluding data-sharing agreements. This talk will focus on the latest thinking and data-sharing initiatives from SANBI.