As I started my secondment at KU Leuven, I took the opportunity to attend the Free and Open Source Software Developers’ European Meeting (FOSDEM) 2024, which was held at Université Libre de Bruxelles on February 3-4. FOSDEM started in 2000 and has hosted more than 5,000 attendees yearly. It is regarded as one of the best gatherings of open-source developers in Europe and beyond.
In this year’s edition, 59 tracks ran in parallel, and several presentations in the Open Research track were particularly relevant to open data research. I especially enjoyed three presentations: “PHAIDRA: A repository where research data goes live (not to die)” by Raman Ganguly from the University of Vienna, “Updating open data standards” by Sara Petti from the Open Knowledge Foundation (OKF), and “Wikimedia projects and OpenStreetMap as an open research infrastructure” by Iolanda Pensa from Wikimedia Italia.
From their presentations, I drew six short insights:
Insight 1: Define clear goals and a roadmap for open data projects. Sara Petti manages open standards development in Frictionless Data, an ongoing OKF project. She highlighted that one of its key takeaways is to specify milestones clearly and have all stakeholders in the project agree on them.
Insight 2: Involve stakeholders from the development stage. Sara Petti and Raman Ganguly emphasised the need to include stakeholders from the early stage of open data initiatives. Raman Ganguly is responsible for developing the University of Vienna’s research database, PHAIDRA. He shared that the initiative has included various stakeholders since the start to ensure it is useful for actual use cases.
Insight 3: Balancing simplicity and catering to diverse needs in open data initiatives is challenging. An attendee asked Sara Petti how her project decided on common data standards, given that from experience, involving diverse stakeholders also means different needs that need to be negotiated. She admitted that it is indeed tricky, but her project took the approach of reducing the complexity of the standards as much as possible and focusing on the aspects most commonly shared.
Insight 4: Do not reinvent the wheel. Sara Petti mentioned that the approach of Frictionless Data is to look for what is already out there (standards, practices, etc.) and try to develop new ones based on them. This would also facilitate the adoption of the new initiatives.
Insight 5: Define the rules of collaboration in open data initiatives. It is essential to define how a collaboration will be conducted, including how decisions are made. Sara Petti also noted that consensus-based decision-making is sometimes not ideal for getting things done.
Insight 6: Legal aspects for certain open data initiatives are (still) quite complex. Iolanda Pensa described the multiple layers of copyright that they need to take into account in the GLAM (Galleries, Libraries, Archives, and Museums) open data project undertaken by Wikimedia. When they first identify a cultural heritage database, they need to check whether the database is CC0. For photographs of any of the cultural heritage items, they need to check the licence granted by the photographer (CC0, CC BY, CC BY-SA). They also need to check whether the owner of the building or the artwork allows photographs to be taken in the first place. Finally, they need to check the rights of the artist or architect who produced the artwork or the building (including, e.g., freedom of panorama, which differs by jurisdiction).
To sum up, some issues in open data initiatives that researchers and practitioners have long identified persist. However, practitioners have also found (or experimented with) various ways to overcome these issues, which researchers can study and draw lessons from. Eventually, these lessons should also be communicated back to practitioners, so that they can learn from other open data initiatives, rather than remaining locked away in inaccessible research papers.
For my PhD project, I look forward to engaging more with practitioners on the ground and sharing my research with them.