How can we use data to make the Web more trustworthy and improve personal safety and security online?


Challenge identifier: SC2-2017



The Web is going through a critical phase, with a number of trends threatening the spirit in which it was originally developed. In some quarters there is increasing dissatisfaction with the amount of misinformation being shared online, a phenomenon recently labelled “fake news”. At the same time, the Web's original values of openness and tolerance have been challenged by an increase in cyber bullying and hate speech. Finally, the freedom envisaged for individuals online is being threatened by personal privacy and security concerns, driven in part by the business models adopted by some of the main commercial players. These concerns were raised by none other than Sir Tim Berners-Lee, the inventor of the World Wide Web, in an open letter to mark the Web's 28th anniversary.

Data-based products and services will play a vital part in creating a more tolerant and open Web, and in showing how commercial value can be created by helping to combat these concerns. For instance, algorithms and AI could assign trust scores to websites based on their content, cross-reference facts across the Web, and predict the likelihood of information being “fake” based on other data. To address privacy and safety concerns, data could be used to develop services which raise awareness of current laws and regulations, simplify the terms and conditions of websites, predict and prevent hateful comments and bullying, and inform users of any possible privacy vulnerabilities.

We are particularly interested in solutions that leverage closed and shared data to:

  • Identify, define and predict “fake news” as well as provide a means for fact checking information prior to publishing;
  • Build tools and services that alert people to privacy vulnerabilities;
  • Analyse and prevent cyber bullying and hate speech;
  • Create solutions which combat algorithmic bias; and
  • Provide a means for checking compliance when new laws and regulations are produced.



Examples of data include but are not limited to:

  • digital archives
  • data from content producers and consumers
  • data from news agencies
  • customer segmentation data

Note that some European data must be used in this challenge. Social media data is allowed, as long as a complementary European data source is also used.


Expected outcomes

Examples of outcomes may include but are not limited to:

  • new apps and services (web, mobile, etc.), including pre-publishing tools for news and online articles
  • new algorithms (for information propagation, provenance mining, prediction, content recognition etc.)
  • new intermediary technologies to integrate data sources
  • new tools and business processes that support decision making by making the insights of algorithms easier to understand
  • registries and distributed ledger applications
  • new forms of hardware (e.g. wearables, VR, AR).


Expected impacts

Participants will need to provide details on the impact measurement framework they would use to show how their solution:

  • helps people get better access to high quality information;
  • improves the accuracy of news and information online;
  • improves awareness of misinformation online;
  • makes people feel safer online; and/or
  • makes people more aware of how their data is being used and where there are vulnerabilities.