Open data are crucial for scientific knowledge production, transparency and accountability, as well as innovation. The value of open data is multi-faceted and includes social as well as economic value. Such value is generated through the creation of new open datasets (often by combining or adding to existing open datasets), the identification of social issues of concern, the creation of new technological infrastructures for open data, educational and awareness activities related to open data, and new data-driven products and services. However, open data initiatives and infrastructures also face many challenges, ranging from lack of financial support to limited participation of non-government data holders, which impact their sustainability.
The ODECO (Towards a sustainable Open Data ECOsystem) consortium, a Marie Curie European project gathering 15 early career researchers, 20 academics from 8 universities and institutions, and 18 public and private partner organisations, conducted a 4-year interdisciplinary training and research programme on ‘sustainable open data ecosystems.’ The research and outputs of ODECO contain valuable practical recommendations on technical, social, economic, legal, governance and design aspects of different components of open data ecosystems. Below, we synthesise these practical recommendations for open data researchers, infrastructure builders, actors, funders and policymakers. The following set of recommendations is meant to support the future of open data, by imagining, building and maintaining open data ecosystems that are financially, socially and ecologically sustainable and equitable, and that empower all stakeholders.
Data Quality
1. Discoverability

Open data initiatives should create multiple access modalities, ranging from open data portals to open APIs. Policymakers should also encourage the adoption of technical openness in data publication by advocating for machine-readable formats. To enhance the accessibility of open data for non-technical users as well as users in low-resource settings, governments and open data researchers should also support the development and distribution of low-code / low-tech data analysis tools.
2. Metadata

Enhancing metadata quality boosts data discoverability. Policymakers should continue to advocate for metadata standards, ensuring datasets include comprehensive descriptions, provenance, and structured classifications. To this end, emerging technologies such as artificial intelligence can be leveraged to generate core metadata automatically, thereby reducing the burden on open data providers.
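As a rough illustration of what automatically generated core metadata could look like, the sketch below drafts a simplified DCAT-style record from the contents of a CSV file and lists the fields a human still needs to supply. It uses plain rules rather than artificial intelligence, and the function names, the choice of required fields, the sample data and the publisher name are all assumptions made for this example, not part of any ODECO output. The publisher is stored as a plain string for brevity, whereas DCAT models it as an agent.

```python
import csv
import io
import json


def draft_core_metadata(csv_text, title, publisher):
    """Draft a simplified DCAT-style record from the CSV content itself."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0] if rows else []
    data_rows = rows[1:]
    return {
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:publisher": publisher,
        # Column names serve as candidate keywords for a human to review.
        "dcat:keyword": [name.strip().lower() for name in header],
        "dct:description": (
            f"Tabular dataset with {len(data_rows)} rows "
            f"and {len(header)} columns."
        ),
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dcat:mediaType": "text/csv"}
        ],
    }


def missing_fields(record, required=("dct:title", "dct:description",
                                     "dct:publisher", "dct:license")):
    """Return the required fields (an assumed list) that are still empty."""
    return [field for field in required if not record.get(field)]


# Hypothetical sample data, for illustration only.
sample = "Station,Date,PM10\nA1,2024-01-01,17\nA2,2024-01-01,21\n"
record = draft_core_metadata(sample, "Air quality readings",
                             "Example City Council")
print(json.dumps(record, indent=2))
print("Still to be completed by the provider:", missing_fields(record))
```

A draft of this kind only reduces the provider's workload; fields such as the licence and provenance still require a deliberate human decision.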
3. Ongoing engagement and participation

Open data initiatives should be attuned to stakeholders’ needs. A priori, open data initiatives should undertake ecosystem mapping to identify different stakeholders and their needs. They may do so using tools from the discipline of design thinking and theoretical principles from the discipline of information visualisation and communication. This can engender both technical data openness and social equity in open data initiatives. On an ongoing basis, open data initiatives should create robust feedback loops. This can include digital design strategies to create participative interfaces in open data portals, such as comments sections or the ability to propose edits to open datasets. Open data initiatives could also adopt participative design strategies, to create spaces for community discussion and deliberation on data re-use. Examples include open data game jams, data physicalisation, data sprints, citizen or participatory science, and game-based classroom learning pedagogies. A posteriori, open data initiatives should also undertake evaluations. This can include automated validation tools, periodic audits, quality dashboards, and collaborative stakeholder engagement to maintain high data quality standards and to assess whether open data initiatives are meeting their originally stated objectives.
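The automated validation and quality dashboards mentioned above can start from very simple measures. The following is a minimal sketch, assuming a tabular dataset published as CSV, that computes the share of non-empty cells per column and flags columns falling below a threshold. The function names, the 0.9 threshold and the sample data are assumptions chosen for illustration; a real audit would cover more dimensions of quality than completeness.

```python
import csv
import io


def completeness_report(csv_text):
    """Return the share of non-empty cells per column (0.0 to 1.0)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    filled = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in filled:
            # Short rows yield None for missing cells; treat them as empty.
            if (row.get(name) or "").strip():
                filled[name] += 1
    if total == 0:
        return {name: 0.0 for name in filled}
    return {name: count / total for name, count in filled.items()}


def flag_columns(report, threshold=0.9):
    """List columns whose completeness is below the (assumed) threshold."""
    return sorted(name for name, share in report.items() if share < threshold)


# Hypothetical sample data, for illustration only.
sample = (
    "district,population,median_income\n"
    "North,12000,31000\n"
    "South,,29000\n"
    "East,8000,\n"
    "West,15000,\n"
)
report = completeness_report(sample)
for column, share in report.items():
    print(f"{column}: {share:.0%} complete")
print("Needs attention:", flag_columns(report))
```

Figures like these can feed a public quality dashboard, which in turn gives stakeholders something concrete to comment on in the feedback loops described above.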
Wide reuse
4. Beyond open government data

In addition to open government data, non-government stakeholders are also an important category of data holders. Policymakers should advocate for legal frameworks that require both government and non-government data holders to release open datasets as well as enable wider reuse of data, especially for public interest purposes such as research or prevention of emergencies.
Public procurement contracts can also be leveraged to obtain more open datasets from commercial data holders. The City of Barcelona included ‘data sovereignty’ clauses in public procurement contracts with vendors contracted to provide services to the city. These clauses required such vendors to share all data generated in the course of providing the contracted service with the public administration in an open, machine-readable format, so that this data could be released as open government data. A similar example exists in the Netherlands, in public procurement contracts between the public agency Rijkswaterstaat and private contractors for water infrastructure projects.
Licenses have also been central to open data ecosystems. Non-government data holders (particularly commercial actors) should also be incentivised to use open data licenses. The open science movement has resulted in open sharing of research artefacts, and holds valuable strategies for other types of non-government data as well. Where the data in question does not relate to any personal or sensitive information, broad licenses that impose little to no restriction on reuse should be used for government as well as non-government data when possible. While communities contribute to open datasets, commercial re-users may offer little to no value back to the maintenance of these datasets or to the preservation of open data ecosystems. In such cases, open licenses that impose stronger copyleft obligations on re-users can also serve as helpful strategies to respond to the current political economy of data re-use, while preserving a culture of openness.
Other incentives can also be used to obtain open data from non-government data holders. Non-government data holders such as open data intermediaries may be reluctant to share open data because of their business interests. Governments may consider offering financial incentives (such as tax credits) to companies. Such financial incentives could also incentivise participation of non-profit and non-commercial actors. Non-government stakeholders can also be tasked with providing additional services to open data initiatives, such as trainings and capacity building tools. In general, government and non-government stakeholders should invest in capacity-building initiatives that can enable non-government data holders as well as non-specialised users to share more open data. Finally, continued advocacy on the value of open data is necessary, to create a culture where such data holders are motivated to release their data as open data.
5. Data Literacy

Users of open data, ranging from NGOs and journalists to non-specialist users and open data intermediaries, require a broad range of skills to generate value from open data. Not all users possess the same data literacy, and governments should invest in building data literacy and digital equity. In this regard, the education sector can be leveraged to improve data literacy across populations. Further, government actors themselves require regular trainings and guides on certain aspects of open data, such as interoperability. Data science trainings for government and non-government actors, as well as simple and accessible step-by-step guides for citizens, can be useful. Design strategies to improve data literacy, such as open data game jams and data physicalisation, can also be useful.
6. Mind the gap(s)

The generation of open data is not a neutral activity. The existence of more open datasets, released by both government and non-government actors, can enable more uses of open data, data-driven policymaking and realisation of more value from open data. At the same time, the volume of open datasets does not always mean that these datasets are representative of the diversity of human experiences and social phenomena. The generation of open data comes with problems of missing data, particularly with regard to data about vulnerable or historically marginalised groups. Here, an ‘open data justice’ approach can be useful, to assess the extent to which open data initiatives are representative of various realities and the extent to which they allow for participation by a diverse range of stakeholders. Participative processes for the generation and use of open data are also necessary, with due regard for accessibility. Open data initiatives should continuously acknowledge and account for the power dynamics in the generation and reuse of data.
Infrastructures for open data
7. Usability

Open data portals maintained by institutions, as well as by governments at regional, national and local levels, are important modalities for access to open datasets. Open data portals should conduct routine evaluations of user experience, and adopt iterative interface design practices. Features such as screen reader compatibility, high-contrast visuals, and accessible navigation cater to diverse user needs, including those of users with disabilities. Careful use of artificial intelligence and collective intelligence technologies can improve the functionalities of open data portals.
8. Interoperability

Maintainers of open data portals should participate in and adopt interoperability standards, to extend the reach of these portals to everyday information search scenarios on other platforms. Governments and organisations should establish mandatory compliance with widely accepted interoperability standards such as DCAT, FOAF, and Schema.org. Governments and organisations should also prioritise the adoption of Linked Open Data principles to strengthen semantic interoperability. At the European level, adopting established classification frameworks, such as those used by the European Data Portal as well as those specified under the Interoperable Europe Act, facilitates interoperability and alignment with broader data ecosystems.
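To make the link between portal metadata and everyday search scenarios concrete, the sketch below maps a simplified DCAT-style record to Schema.org `Dataset` markup in JSON-LD, the kind of markup that general-purpose search engines can read from a dataset's landing page. The property correspondences used here (for example `dct:title` to `name`, `dcat:downloadURL` to `contentUrl`) cover only a handful of fields, and the function name, the flat input structure and the example.org URL are assumptions made for this illustration rather than a complete or official mapping.

```python
import json


def dcat_to_schema_org(dcat):
    """Map a simplified DCAT-style dict to Schema.org Dataset JSON-LD."""
    distributions = [
        {
            "@type": "DataDownload",
            "contentUrl": dist.get("dcat:downloadURL"),
            "encodingFormat": dist.get("dcat:mediaType"),
        }
        for dist in dcat.get("dcat:distribution", [])
    ]
    mapped = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": dcat.get("dct:title"),
        "description": dcat.get("dct:description"),
        "keywords": dcat.get("dcat:keyword", []),
        "license": dcat.get("dct:license"),
        "distribution": distributions,
    }
    # Drop properties that were absent in the source record.
    return {key: value for key, value in mapped.items()
            if value not in (None, [], "")}


# Hypothetical record, for illustration only.
dcat_record = {
    "dct:title": "Street trees",
    "dct:description": "Location and species of publicly managed trees.",
    "dcat:keyword": ["trees", "environment"],
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "dcat:distribution": [
        {
            "dcat:downloadURL": "https://example.org/data/trees.csv",
            "dcat:mediaType": "text/csv",
        }
    ],
}
schema_record = dcat_to_schema_org(dcat_record)
print(json.dumps(schema_record, indent=2))
```

Maintaining one authoritative record and deriving the other representations from it avoids the drift that arises when the same dataset is described by hand in several vocabularies.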
9. Public administrations’ investments

Public administrations should adopt a more active approach towards open data initiatives. Funding is necessary for shared open infrastructures, including digital public infrastructures, that limit infrastructural dependencies on commercial actors. Public administrations should also build and maintain infrastructures that allow different stakeholders to publish, access and reuse open data, as not all actors may have the resources necessary to develop such infrastructures from scratch. Within the European Union, for instance, publicly funded infrastructures such as the European Open Science Cloud could enable various government and non-government actors to come together and use this infrastructure to publish open datasets, analyse open datasets, and generate knowledge from open data.
Public administrations should coordinate civic projects involving themselves and other actors such as universities, civil society organisations, and commercial data holders. This can result in the formation of communities through open data, as well as collaborations between these actors to generate value from data. For instance, the Glasgow Centre for Population Health coordinates civic projects involving local administrations, universities and civil society organisations. This led to the creation of ‘Understanding Glasgow’, a website that hosts visualisations on health and life circumstances, covering poverty, transport services, population and culture, to name a few.
Primary authors:
- Ramya Chandrasekhar, https://orcid.org/0009-0007-1313-1551
- Mélanie Dulong de Rosnay, https://orcid.org/0000-0002-0297-7603
Commentators:
- Ashraf Shaharudin, https://orcid.org/0000-0001-8640-6420
- Héctor Ochoa Ortiz, https://orcid.org/0000-0002-6477-0683
- Bastiaan van Loenen, https://orcid.org/0000-0001-8847-6334
- Caterina Santoro, https://orcid.org/0000-0002-6117-6566