Bringing metrics to a polluted information ecosystem

We are at a watershed moment in the internet’s history. The #MeToo movement, the Parkland teenagers, and others have demonstrated the profound power of democratized, decentralized media. Open data and open access have enabled insights and innovations that would have encountered massive barriers in the past. And the internet has brought a voice and connection to billions of people previously without one.

But as with previous technology revolutions, the shadow side of the progress is dark. Disinformation is one of the most dangerous shadow sides of the open internet. The weaponization of mass influence to achieve political or intolerant social ends has made disinformation public enemy number one across much of the world. Governments (mostly) hate it as it misleads and polarises electorates and lowers trust. Advertisers (mostly) hate it as their brands can be damaged by being seen next to false content. The media (mostly) hate it as it cuts right across the ethics of good journalists to publish something fake or misleading, and it makes it harder for reputable media to attract the ad dollars that are being siphoned off by the less reputable ones. The public (mostly) hate it as no one likes being duped. And the social media companies (mostly) hate it as they are at risk of losing ad dollars as advertisers pull money off services polluted by the fake stuff. They also fear it, as governments, particularly in Europe, are gearing up to regulate social media networks unless they can demonstrate giant strides in reducing the spread of disinformation, hate speech, abuse, and extremism. Given the recent revelations about how social media platforms are being used in ways antithetical to democracy, the risk of ill-conceived regulations and their unintended consequences just got higher.

Worst of all, disinformation can over time undermine faith in institutions, government, and even democracy itself.

Given the dangers of disinformation and the fact that it touches such a wide range of stakeholders, it is not surprising that there is a plethora of initiatives to try to stem its flow. From trusted journalism to fact checking, from news quality scoring to credibility indicators, people are trying many routes to banish the bad stuff. We believe in the old Peter Drucker mantra of “You can’t manage what you don’t measure.” We want to measure disinformation, or at least the risk of disinformation, and thereby enable steps to reduce it.

Our organisation, The Global Disinformation Index, is trying to amplify the work of many of the other initiatives and bring some much-needed metrics to this field. We are working to create the first global rating system for the world’s media that will assign a rating to each source based on the probability of that source carrying disinformation. In much the same way as credit rating agencies rate countries and financial products, with AAA for low investment risk all the way to Junk status for the most risky investments, so the index will do for sources large and small. It will allow media outlets to demonstrate an external benchmark of their veracity to reassure readers and advertisers. We hope social media platforms will allow those ratings to be seen when content is shared on their platforms, giving those who get their news from social media feeds, rather than primary sources, more context about what they are about to read. And we hope to become the definitive authority when it comes to tracking disinformation as it spreads online.
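To make the credit-rating analogy concrete, here is a minimal sketch of how a probability could be turned into a letter band. The band edges and labels are hypothetical, chosen only for illustration; they are not the index's actual scale.

```python
from bisect import bisect_right

# Hypothetical band edges on the probability that a source carries
# disinformation. These numbers are illustrative assumptions only.
BAND_EDGES = [0.05, 0.15, 0.30, 0.50, 0.75]
BAND_LABELS = ["AAA", "AA", "A", "BBB", "BB", "Junk"]


def risk_band(probability: float) -> str:
    """Map a risk probability in [0, 1] to a credit-rating-style label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # bisect_right counts how many edges are <= probability, which is
    # exactly the index of the band the probability falls into.
    return BAND_LABELS[bisect_right(BAND_EDGES, probability)]


if __name__ == "__main__":
    for p in (0.02, 0.40, 0.90):
        print(p, risk_band(p))
```

Where the edges sit is a policy choice rather than a technical one, which is one reason the criteria behind any such scale would need to be public.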

By aggregating all the ratings of outlets in a country we can determine an overall disinformation risk rating for that country. Over time this will begin to show us trends in the geographic spread of disinformation around the world, highlight hotspots, and identify countries that are successfully tackling the problem. An example of what the output could look like with dummy data is shown below.
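The aggregation itself could take many forms. As a minimal sketch, separate from the example output mentioned above, the snippet below averages per-outlet risk probabilities by country. Weighting each outlet by an audience figure is an assumption made here for illustration, and the country names and numbers are made up.

```python
from collections import defaultdict


def country_risk(outlets):
    """Aggregate outlet-level risk into a weighted mean per country.

    `outlets` is an iterable of (country, risk_probability, audience_weight)
    tuples. The audience weighting is a hypothetical design choice.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # country -> [weighted sum, weight]
    for country, risk, weight in outlets:
        totals[country][0] += risk * weight
        totals[country][1] += weight
    # Skip countries whose total weight is zero to avoid dividing by zero.
    return {c: s / w for c, (s, w) in totals.items() if w > 0}


if __name__ == "__main__":
    dummy_outlets = [
        ("Country A", 0.10, 3.0),
        ("Country A", 0.70, 1.0),
        ("Country B", 0.40, 2.0),
    ]
    print(country_risk(dummy_outlets))
```

An unweighted mean would treat a tiny blog and a national broadcaster alike, so some form of weighting seems likely to matter, but which one is appropriate is an open question.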

A probability risk rating is not a value judgement. It does not censor. It allows people to make their own, informed choices about how much credence to put into what they engage with, just as we are free to invest in junk bonds, despite the heightened risk.

To be useful, the index needs to be able to do this in real time and in all languages. We are working with a team of AI experts to try all the approaches out there for automatically detecting and indexing signals from the sources. We will look at metadata around the source (country of origin, technologies used, standards adopted, etc.) and such signals as the frequency of publishing, the number of articles on the site, and the average hits to each article. We will look at signals from the content itself, such as stance detection (is the language used neutral or emotive), profanity detection, “clickbaity-ness,” and overall style. We will look at the network distribution of the content from that source: who shares it and where. And finally, we will look at state and corporate sponsorship of content, both covert and overt. No one signal will be the key, but by building up a picture of many signals we are confident we can, in real time, give a probability rating of the risk of a source carrying disinformation.
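As a rough illustration of how many weak signals can be combined into one probability, the sketch below uses a simple logistic model. This is a deliberately simpler stand-in, not the model used for the index; the signal names, weights, and bias are all hypothetical, and in practice they would be learned from labelled data rather than set by hand.

```python
import math
from typing import Mapping

# Hypothetical signals, each scaled to the range [0, 1], with made-up weights.
WEIGHTS = {
    "emotive_language": 1.5,
    "profanity": 0.8,
    "clickbait": 1.2,
    "covert_sponsorship": 2.0,
}
BIAS = -3.0  # Hypothetical baseline: with no signals present, risk is low.


def disinformation_risk(signals: Mapping[str, float]) -> float:
    """Combine signal scores into a probability with a logistic function."""
    # Missing signals default to 0.0, so no single signal is required.
    z = BIAS + sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print(disinformation_risk({}))
    print(disinformation_risk({"emotive_language": 0.9, "clickbait": 0.8}))
    print(disinformation_risk({name: 1.0 for name in WEIGHTS}))
```

Because the output rises smoothly as evidence accumulates, no one signal decides the rating on its own, which matches the idea of building up a picture from many signals.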

Our early proof of concept has been built with the kind support of the Knight Foundation, and the results are encouraging. Trained on a list of known disinformation domains that have been previously fact-checked, we’ve demonstrated that a neural network is able to correctly identify a site as high risk of disinformation 98% of the time. These are early results and the sample size is small, but we are now looking for funding to scale the methodology up to an English-language prototype, followed by a global expansion.

This kind of disinformation risk detection has not been done before at scale, automatically, and in many languages. We strongly believe this is such an important issue that it needs the world’s collective, collaborative efforts to solve it. That is why we have set up The Global Disinformation Index as a not-for-profit venture to act as a “big tent,” bringing the best minds together to solve one of the world’s most pernicious problems. We aim to use many of the existing initiatives as inputs into the ratings, thereby amplifying much of the great work being done in the field.

There is a great need for someone with no skin in the game to assess which sources carry the greatest risk of disinformation. This is a role that no established actor can legitimately fulfill. No one wants a government-backed “Ministry of the Truth,” or the “facts as decided by Facebook.” We believe that a new entity with no ties to political or commercial agendas is needed. The Global Disinformation Index will take that role.

Given the sensitive nature of disinformation, the governance of the index will be as crucial to its usefulness as the technology underpinning it. We will build the index on three governance pillars:

  1. Neutrality — The Index will be governed by a Board of Trustees with no political or geographic bias and no commercial ties to any media companies, as well as by an Advisory Group of experts in the field from around the world.
  2. Independence — We will strive for a broad base of funding from governments, companies, foundations, and individuals with no single funding source contributing a dominant portion of the total funds.
  3. Transparency — The methodology behind the risk ratings will be freely available to all. The criteria which determine the ratings will be public.

This is a huge, cross-sectoral, cross-country effort to solve one of our generation’s greatest challenges. We need all the help we can get. If you are working on any aspect of the disinformation challenge, especially if you are a data scientist or student working in the fields of AI and machine learning, we would love to hear from you. Please sign up to our mailing list at

