How Senate Matching transforms privacy-preserving matching

CLIENT: Data Republic

Original article: datarepublic.com

Published: 2019

Between tireless bad actors and increasingly stringent regulations, businesses can no longer afford a business-as-usual approach to data matching.

Across the globe, growing numbers of public and private-sector organizations are using data matching to improve customer service, reduce fraud and boost revenues. Data matching can be one of the most powerful silver bullets in any organization’s armoury.

But like so many other data-related tools, data matching has been bedevilled since its birth by the utility-privacy trade-off. That is, organizations have had to choose between prioritizing the security of their clients’ Personally Identifiable Information (PII) and extracting the maximum benefit from sharing their data sets.

Old-school data matching

Back in the bad old days, organizations tended to cluster at the ends of the privacy-utility spectrum. The two nuclear options commonly taken were either keeping data sets firmly under lock and key or sharing data without proper data governance processes to protect individual customer privacy.

The demands of the digital economy meant the first of those options long ago ceased to make commercial sense. Citizens’ growing concerns about how their data is being used (and politicians’ efforts to address those concerns) mean that organizations can no longer continue with the second option.

Privacy-preserving matching technologies

Over the years, many of the best minds in the tech industry have attempted to square the utility-privacy circle through privacy-preserving matching technologies (i.e. technologies that protect privacy without excessively downgrading data utility).

Some of the most popular methods used to protect PII while still attempting to gain deeper insights from comparing data sets have been:

Hashing – It’s cheap and easy for organizations to mask aspects of PII by, for instance, transforming customer names into numbers. The problem is that it’s also cheap and easy for hackers to effectively reverse the process, for example by hashing lists of likely names and comparing the results, and so convert numbers back into customer names. Combining hashing with salting makes things much more difficult for hackers. But they can still reidentify the data in certain circumstances.

Encryption – This is a step up from simple hashing but comes at the cost of key management complexity. It can also result in plain text PII being revealed.

Homomorphic encryption – The main pro with this technology is that plain text PII doesn’t run the risk of being revealed. The significant con is that the utility of the data is reduced.

Centralized token stores – It’s hard to see how e-commerce could have taken off without tokenization, and it undoubtedly goes a long way to resolving the utility-privacy conundrum. But the very centralization of data in these stores means that if things do go wrong, they can go very wrong.

Distributed ledger technologies – By definition, these remove the risks involved with centralizing data. The catch is that there are no governance controls once the organizations doing the data matching agree to share their data sets.
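To make the hashing trade-off above concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's implementation: the function names, the sample names and the secret salt are all hypothetical, and HMAC-SHA-256 is used here as one common way of combining a hash with a secret salt. It shows an unsalted hash being reidentified by simple guessing, while the same guesses fail against the keyed version as long as the salt stays secret.

```python
import hashlib
import hmac
from typing import List, Optional


def plain_hash(value: str) -> str:
    """Unsalted SHA-256 of a normalized value (the 'cheap and easy' option)."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def keyed_hash(value: str, secret_salt: bytes) -> str:
    """HMAC-SHA-256 keyed with a secret salt (an assumption for this sketch)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_salt, normalized, hashlib.sha256).hexdigest()


def dictionary_attack(target_hash: str, candidates: List[str]) -> Optional[str]:
    """Try to reidentify an unsalted hash by hashing a list of likely names."""
    for guess in candidates:
        if plain_hash(guess) == target_hash:
            return guess
    return None


# Hypothetical data: an attacker holds a hashed name and a list of guesses.
guesses = ["John Smith", "Jane Citizen", "Alex Lee"]

leaked = plain_hash("Jane Citizen")
recovered = dictionary_attack(leaked, guesses)  # the unsalted hash is reidentified

salt = b"hypothetical-shared-secret"
protected = keyed_hash("Jane Citizen", salt)
recovered_salted = dictionary_attack(protected, guesses)  # fails without the salt
```

The caveat from the list still applies: if the salt leaks, an attacker can rerun the same guessing attack with the keyed function, which is one of the "certain circumstances" in which salted data can be reidentified.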

Senate matching: bringing it all together

After co-founding Data Republic four years ago, Danny Gilligan and the team experimented with a range of privacy-preserving matching technologies. Then they decided to attempt to combine the best bits of all of them. “The first revelation was that centralizing data wasn’t a good idea,” he says. “The second epiphany was that many problems would be avoided if PII stayed within organizational boundaries and only data that had been comprehensively deidentified moved around. That was the genesis of Senate Matching, Data Republic’s Private-by-Design platform, which launched in 2018.”

As explained here, data fed into the Senate Matching platform is hashed, salted, sliced, encrypted and tokenized. Then, just to be on the safe side, the token slices are distributed across a network of nodes, called Matcher Nodes, to make it near impossible for any party – including Data Republic – to reconstruct the original customer data.
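The general idea of slicing and distributing tokens can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Data Republic's actual implementation: the number of nodes, the shared salt, the function names and the sample email addresses are all hypothetical, and the encryption step is omitted for brevity. Each organization hashes and salts its PII locally, cuts the result into slices, and sends each slice to a different node under a random token, so no single node ever holds a complete hash.

```python
import hashlib
import hmac
import secrets
from typing import Dict, List, Set, Tuple

NUM_NODES = 4  # hypothetical number of matcher nodes


def salted_digest(pii: str, salt: bytes) -> bytes:
    """Hash and salt a normalized PII value (HMAC-SHA-256, 32 bytes)."""
    return hmac.new(salt, pii.strip().lower().encode("utf-8"), hashlib.sha256).digest()


def slice_digest(digest: bytes, parts: int) -> List[bytes]:
    """Cut a digest into equal slices, one per node."""
    size = len(digest) // parts
    return [digest[i * size:(i + 1) * size] for i in range(parts)]


def contribute(records: List[str], salt: bytes) -> Tuple[Dict[str, str], List[Dict[str, bytes]]]:
    """Runs inside the contributing organization; raw PII never leaves it.

    Returns a private token-to-PII map (kept locally) and, for each node,
    a token-to-slice payload that is safe to send out.
    """
    token_to_pii: Dict[str, str] = {}
    node_payloads: List[Dict[str, bytes]] = [{} for _ in range(NUM_NODES)]
    for pii in records:
        token = secrets.token_hex(8)  # random token, unrelated to the PII
        token_to_pii[token] = pii
        for node, piece in enumerate(slice_digest(salted_digest(pii, salt), NUM_NODES)):
            node_payloads[node][token] = piece
    return token_to_pii, node_payloads


def node_match(a: Dict[str, bytes], b: Dict[str, bytes]) -> Set[Tuple[str, str]]:
    """One node compares only the slices it holds and reports token pairs."""
    by_slice: Dict[bytes, List[str]] = {}
    for token, piece in b.items():
        by_slice.setdefault(piece, []).append(token)
    return {(ta, tb) for ta, piece in a.items() for tb in by_slice.get(piece, [])}


def match(payloads_a: List[Dict[str, bytes]], payloads_b: List[Dict[str, bytes]]) -> Set[Tuple[str, str]]:
    """A pair counts as matched only if every node agrees on it."""
    per_node = [node_match(a, b) for a, b in zip(payloads_a, payloads_b)]
    return set.intersection(*per_node)


# Hypothetical usage: two organizations with one customer in common.
salt = b"hypothetical-shared-salt"
bank_tokens, bank_payloads = contribute(["jane@example.com", "sam@example.com"], salt)
airline_tokens, airline_payloads = contribute(["JANE@example.com", "lee@example.com"], salt)
pairs = match(bank_payloads, airline_payloads)
```

In this sketch the match result is a set of token pairs. Only the organization that issued a token can map it back to a customer, and a node that is compromised on its own reveals just a fragment of a salted hash.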

Incredibly, all this occurs without damaging the utility of the data. Rather than getting a general match result without detail, data scientists gain secure access to matched tables in a quarantined analytics workspace where they can see exactly which fields are matched, while PII is protected.

“It’s very advanced, incredibly original technology,” Gilligan says. “Technology that minimizes systemic risk while maximizing the potential for innovation.”

Real-world use cases

The possibilities of a platform that allows organizations to data match without having to worry about taking a reputational or financial hit are near endless.

The 140 American, Australian and Singaporean organizations that have been using Data Republic’s data-sharing technologies in recent years have most commonly done so to unlock insights, perform high-value analytics, exploit joint-marketing opportunities, accelerate innovation and realize the value of organizational data sharing. Those Data Republic customers that have been using Senate Matching in recent months have chiefly harnessed the platform for:

If you’re interested in learning more about what Senate Matching can achieve for your organization, you can find more information here.