The benefits of data sharing
Sharing data has a variety of benefits in different areas, and different stakeholders benefit from data sharing in different ways.
The most obvious benefits of data sharing are economic – after all, it is often companies that hold and share data, and data users invest their time and effort into developing marketable products and services. Working with data promises a variety of new products and services, jobs, business intelligence, or efficiency savings. Data sharing can also unlock benefits for the environment and society at large, or lead to increased tax revenue.
We will now define and discuss each of the roles involved in data sharing – data holders, data users, intermediaries and society – in turn.
For data holders
Data holders are organisations that supply the data in a data sharing relationship. They hold – or have control over – data, and they may or may not ‘own’ it; for example, the data subjects might remain owners of the data an organisation holds about them, but the organisation has a right to use this data. A data holder might be a corporation or business, a department within an organisation, an NGO, a research consortium, a public entity, or any other organisation that holds data from any source.
Data holders may want to share their data for several reasons: to solve a business problem that they lack the skills to deal with in-house, to gain a competitive advantage by improving their data quality or products, or to explore what can be done with their data. Sharing the data with a data user – or possibly just with another department within the organisation – can provide new, creative ideas about what to do with the data or how to process it. For data holders, the main benefits of sharing their data are efficiency savings, new or improved products and services, solutions to existing or future business problems, and a better understanding of what is in their data. They may not have the expertise to develop these solutions internally, or it may not be economically sensible for them to work on the data themselves. By sharing data they can also get a glimpse into a developing market, which helps them to remain competitive.
Additional benefits of data sharing for data holders:
- improved internal data structure
- increased legal compliance
- skill development
Our work with data holders has shown that their motivation to share data falls into two categories: they may be seeking a solution to a specific problem, such as improving customer recommendations or making a process more efficient; or they may be exploring, with the goal of finding out what can be done with their data in a specific area, such as developing new value propositions or products from customer data.
Case study: Exploring data with Greiner Packaging International GmbH (GPI)
GPI was a data holder in the second call of Data Pitch. The company has a dedicated post, with decision-making authority, for advancing data-driven innovation. This made it easy for them to join the programme and make a large variety of data available. Their challenge was to use the sensor data from three of their manufacturing plants to develop solutions that enhance the business in terms of manufacturing, logistics, supply chains, or even sales. This breadth of opportunity, and their capacity to fully commit to the innovation process, resulted in their partnering with five data users that are now working on advancing different areas of their business.
For data users
Data users are organisations that use data shared by a data holder to develop new insights, products or services. In Data Pitch, these users were typically innovative start-ups or small to medium enterprises, but they could also be another organisation, a different department in the same organisation, a university, students, individuals, or activist groups. Their main benefit in data sharing is access to data which they or their competitors would otherwise not have, which allows them to generate new insights, develop new or improve existing products or services, and establish themselves in the market. In other cases, access to vast data sets allows the data users to improve their machine learning, deep learning and artificial intelligence capabilities.
Additional benefits of data sharing for data users:
- business relationship with data holder (as clients, investors or other partnership)
- insight into new markets
Data users may also be data holders, and the sharing relationship in that case may be reciprocal. Organisations could swap their data, pool it for mutual benefit, or the data user could supplement data that is shared with them with their own data. The latter was the most commonly observed situation in Data Pitch: an innovating user would combine the data provided by the data holders with their own proprietary data, to produce a solution that benefits both organisations.
Case study: Combining data with IPlytics and SpazioDati
IPlytics is a German start-up specialising in business intelligence. They joined the Data Pitch programme in 2018, responding to the challenge by SpazioDati, who were looking to enhance their existing business intelligence knowledge graph. IPlytics’ proposed solution was to supplement SpazioDati’s extensive data on various business sectors in Italy with their own data, which in turn amalgamates different sources of public data, such as patents and research publications. Their platform can be used to identify and act upon future technology trends. During the course of their participation in Data Pitch, both organisations gained a significantly improved database for their respective platforms.
“What excites us most is the possibility of enriching our data with the data provider’s external data, as well as the possibility of having a real impact by adding value to their data in turn.”
Rosann Brandt, IPlytics COO
For intermediaries
Intermediaries are not a necessary part of data sharing, but they play a role in many data sharing relationships. There are many possible forms and roles of intermediaries, but as a general rule, they operate between data holders and users, and enable or help to scale the data sharing process in various ways, which we outline below.
Intermediaries will want to achieve their own specific goals, which can be defined by their members, funders, shareholders, or other decision-makers. Often this goal will be revenue: services to enable data sharing relationships are a marketable product in themselves. In other cases, such as Data Pitch, the goal is set by a funder: the European Commission aimed to grow the EU data economy, increase tax revenue, create jobs, and make the EU more competitive in data markets. Other organisations, such as NGOs or universities, might want to improve data protection, or increase the use of research data.[19]
There may be downsides to having an intermediary involved in a data sharing relationship. For example, more due diligence might be required if either public bodies or funding are involved. Such intermediaries may be held to a higher standard of accountability, increasing the resources required for checks and balances. Similarly, when an intermediary increases the efficiency of data sharing relationships, it does so through standardisation. Consequently, the other stakeholders could have less freedom to set the terms under which they share data. That could be seen as a disadvantage: not using the intermediary allows the parties to draw up a contractual relationship that may fit their needs more precisely. However, this also means that they would have to do all of the work involved themselves, making them less able to scale their engagement.
Case study: Enabling data sharing at the Alan Turing Institute – Data Study Groups
The Alan Turing Institute arranges week-long ‘collaborative hackathons’, which bring together talented researchers from a variety of backgrounds, such as data science or artificial intelligence, with industry problem owners. Organisations define a challenge, provide a dataset, and pay a fee to engage. The institute recruits PhD researchers from a variety of domain and data science backgrounds to work on the challenge for one week.
The data holders get to quickly prototype possible solutions to their challenges. The researchers get an opportunity to put knowledge into practice and go beyond individual fields of research to solve real world problems. As the intermediary, the institute gains industry collaborations, with the ideas generated acting as seeds that can kick-start larger collaborative research projects.
Types and roles of intermediaries
There are a variety of types of intermediaries. For the purposes of this toolkit, we class as intermediaries any third party organisation or platform that facilitates the sharing of data between one organisation and another.[20] Depending on how much of the process they facilitate, these could range from individuals to institutions to associations. Which form they should take is currently being explored, through projects such as the data trust pilots by the Open Data Institute[21] or data collaboratives by GovLab.[22] For example, a data trust could be an institution that pools data from individuals and then negotiates terms for the use of this data on their behalf; it could be an innovation programme such as Data Pitch, which distributes development funds from the European Commission; or an institution that governs the sharing of data with a statutory oversight. It could be a framework of rules that enables data sharing, a legal construct, an organisation, or a data store.[23]
The International Data Spaces ecosystem aims to be a de facto market standard for the trade and exchange of all kinds of data assets. It facilitates the finding and authentication of appropriate transfer partners and also the legal and commercial governance of transactions.[24] The Big Data Value Association iSpaces label identifies platforms that facilitate the sharing of closed data for various innovation purposes.[25] They include the Big Data Centre of Excellence in Barcelona and the Smart Data Innovation Lab in Germany.
Different concepts of intermediaries envisage them as performing a variety of roles, and consequently the benefits they provide, and the associated costs, will differ. There are a number of roles they could fill:
- Enabling scale and capacity: Intermediaries can help to scale data sharing relationships. Rather than developing 1:1 relationships between every data holder and user, an intermediary could bundle data holders, data subjects, or even the actual data, and make it accessible to data users on specified terms. This may entail managing the physical access to the data, or applying a framework of rules and obligations under which access to data is granted.
- Reducing complexity: Data sharing can be a demanding process and, at least initially, an expensive one. Setting up a sharing relationship takes a lot of time from a variety of internal stakeholders and experts, particularly in large organisations with complex hierarchies and decision-making structures. Intermediaries can design and apply institutional processes and regulations, and conduct due diligence checks to comply with legal requirements such as GDPR.
- Matchmaking: Intermediaries could identify suitable matches between data holders and users, depending on the type of data they offer or seek, and facilitate either or both of these relationships. This is what data marketplaces already do, and Data Pitch did to some degree.
- Providing infrastructure: Intermediaries may provide the necessary infrastructure for data sharing, although in our experience this can be treated as a commodity. There are many solutions available on the market through which data can be shared, and most data holders and users shared data through platforms or servers that they already had access to.
- Creating trust: Having a third party involved between a data holder and a data user can make negotiations easier. The intermediary could act as an arbiter, ensuring that both sides get sufficient benefit from the relationship, or conduct the decision-making process that determines eligibility to engage in data sharing, for example by assessing data user pitches.
- Supporting: When an intermediary’s main task is to grow a specific market, or address a specified set of challenges, it might focus on supporting data holders and/or users, for example by supplying templates, conducting necessary checks, or even supplying funds. Such an intermediary would likely be funded by a public body or investor, or any other entity that has an interest in growing a specific market area. Intermediaries may thus help data holders or users to acquire the skills required to engage in data sharing; Data Pitch has demonstrated this in the context of GDPR compliance.
- Developing best practice: Due to their central role in data sharing relationships, intermediaries can specialise, and generate and apply best practice, making data sharing both easier and more cost-effective. This knowledge can be put to further use, for example in advising policy.
For society
While society does not have a specific role in data sharing, it is a vital participant. There are a number of benefits to society at large that can result from data sharing. First and foremost, if innovation through data sharing improves products and services, the public benefits from having those new or better products and services available, even though the innovation will typically be economically motivated. For example, customer service experiences are improved through chat bots or recommendations. Another area of innovation is health, where diagnoses or provision of care can be improved, and health services made more efficient and customer focussed. Data sharing can also contribute to a safer, cleaner environment, and even help tackle climate change. Many of the social benefits of data sharing double as environmental benefits. For example, data sharing can help to improve supply chains, which in turn reduces the unnecessary transport of goods. When data is shared to develop systems that reduce emissions or energy consumption in buildings, this in turn has a positive impact on air quality and public health.
For public sector data holders, data sharing can help to achieve goals of public interest, such as safer roads. Nine out of Data Pitch’s 47 data users aimed for environmental effects, for example by improving traffic flows or maintenance works, both of which could contribute to better services, a safer urban environment and reduced emissions. Safety was the focus of a number of other projects, which aimed to give citizens better control of their data, improve their privacy, and strengthen organisations’ compliance with the GDPR. Enhanced insight through data can also be used to make more socially responsible and environmentally friendly decisions within organisations, and lead to better policy decisions among state actors.
In a wider sense, society or the data community benefits from data sharing, as new jobs are created. If new algorithms or other AI insights are published openly, the ecosystem can benefit further as learning is accelerated. This will increase awareness, as well as the quality of data and data processing. Case studies of successful data sharing will also make other organisations more likely to engage in future data sharing activities, increase the availability of data for everyone, grow the data ecosystem, and turn data sharing into a more common practice.
In the next section of the toolkit we demonstrate how data holders, users and intermediaries interact to create economic value and develop the data ecosystem in Europe, through a case study of the Data Pitch data innovation programme.
Case study: Met Office addressing air quality with GoSweat and Hop Ubiquitous
Data users GoSweat and Hop Ubiquitous are working with Met Office data provided through Data Pitch to improve the use of pollen and air pollution data, and ultimately to benefit public health. Hop Ubiquitous is building a decision support system to help public servants and citizens make more environmentally aware decisions. GoSweat’s application allows end users with hay fever to plan their exercise around pollen forecasts.
“We believe in sharing our data and enabling others to use it. We see our involvement with Data Pitch as a key to making data more available and usable. [The ideas developed through Data Pitch should support] UK citizens, by making life easier, protecting them, helping them prosper or improving well-being.”
Richard Carne, Chief Digital Officer, Met Office
Case study: Deutsche Bahn and Ubiwhere enhancing transport flows for economic and environmental benefits
Ubiwhere uses data provided by Arriva (the UK arm of Deutsche Bahn) to improve their mobility solution. By combining historical booking data with data about external factors, such as weather, points of interest, or seasons, they provide business intelligence that will allow their customers to identify and proactively resolve inefficiencies.
“We want to improve the punctuality of our bus services based on outside influences such as traffic flows, weather, events and unforeseen incidents and how they impact schedules, and to see what we could do to be not only reactive, but also proactive – so for example if a car breaks down on one of our routes we can find out in advance and address it with diversions for following buses. Ubiwhere used external data points to see how they affected journey planning. This included traffic light networks, crowdsourced data for traffic flows and weather patterns.”
Stuart Walker, Senior Product Manager, DB/Arriva
[19] Lopez de Vallejo, I, Scerri, S, Tuikka, T (eds) (2019): Towards a European Data Sharing Space. Brussels. BDVA
[20] However, we should note that some organisations thus classified see themselves as enablers of direct sharing, rather than as directly intermediating the relationship.
[21] https://theodi.org/project/data-trusts/
[22] https://datacollaboratives.org
[23] Hardinges, J (2018): What is a data trust. Open Data Institute
[24] https://www.internationaldataspaces.org/