It was great to attend the Gartner® Data and Analytics Summit 2022 in Orlando and to meet so many customers, vendors and Gartner® analysts face to face. The show was excellent: well organized and brimming with insightful, practical content. Between sessions we met with innovative CDOs from some of the largest data-driven enterprises and learned about their strategies for overcoming data silos. Much of the buzz was about how data continues to fuel our global economy, about data footprint, data storage and data cleaning, and about synthetic data. But one topic was eerily absent: data collaboration. There seems to be a misconception that synthetic data is a cure-all for every secure data collaboration need, which suggests that much more education about Privacy Enhancing Technologies is needed.
How can CDOs deliver on their promise to remove data silos without a way to share and collaborate on sensitive data in a secure, privacy-preserving manner?
Of all the data privacy tools, synthetic data was unquestionably the most frequently discussed. The logic for many of the Fortune 500 companies present was simple: we need a scalable way to train models on relevant data without putting the real data at risk. Synthetic data generation tools create a new, artificial data set that has the same structure as the sensitive data and approximates its real values. Data scientists then use that data set to train models that can, when ready, be applied to the real data at scale.
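To make the idea concrete, here is a minimal sketch in Python of what "same structure, approximate values" can mean. It is an illustration only, not how any particular vendor's tool works: the function names (`fit_marginals`, `sample_synthetic`) and the sample records are hypothetical, each column is modeled independently (so relationships between columns are lost), and the approach carries no formal privacy guarantee.

```python
import random
import statistics

def fit_marginals(rows):
    """Summarize each column: mean/std for numeric, value counts for categorical."""
    model = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            model[col] = ("numeric", statistics.mean(values), statistics.pstdev(values))
        else:
            cats = sorted(set(values))
            model[col] = ("categorical", cats, [values.count(c) for c in cats])
    return model

def sample_synthetic(model, n, seed=0):
    """Draw n artificial rows from the per-column summaries (columns are independent)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        row = {}
        for col, spec in model.items():
            if spec[0] == "numeric":
                _, mu, sigma = spec
                row[col] = rng.gauss(mu, sigma)
            else:
                _, cats, weights = spec
                row[col] = rng.choices(cats, weights=weights, k=1)[0]
        out.append(row)
    return out

# Hypothetical "sensitive" records, invented for this illustration.
real = [
    {"age": 34, "balance": 1200.0, "segment": "retail"},
    {"age": 51, "balance": 8300.5, "segment": "wealth"},
    {"age": 27, "balance": 450.0, "segment": "retail"},
    {"age": 45, "balance": 3100.0, "segment": "business"},
    {"age": 62, "balance": 9900.0, "segment": "wealth"},
]

synthetic = sample_synthetic(fit_marginals(real), n=500)
```

The synthetic rows share the real data's columns and rough distributions, which is enough to prototype a model. Because each column is sampled on its own, anything that depends on how columns relate to one another in the real data is not reproduced by this simple sketch.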
But even within this approach, there are a few caveats that necessitate a more comprehensive platform to truly deliver on the privacy-enhancing promise:
The buzz about synthetic data is understandable – it’s a fantastic tool for a highly relevant purpose. But for me, the biggest insight from the Summit came from what was conspicuously absent from the conversation: breaking down rigid data silos. It is commonly accepted and frequently stated that data silos must be eliminated to extract novel insights and drive better business decisions. When the data is not sensitive and can be exposed freely, bridging these silos is a straightforward access management task. But when much of an organization’s most valuable data is also extremely sensitive, many insights go unnoticed and the data remains unused or underused. Financial, research, genomic, patient, and customer data are just a few examples of sensitive data that require the utmost care to protect and keep private. Yet much of the data and analytics community is under the impression that collaborating on sensitive data, while desirable, is impossible to do in a way that adequately preserves privacy.
The field of privacy-enhancing technologies (PETs) holds the keys to finally breaking down internal and external data silos and unlocking the power of data collaboration, but to many it still seems like ‘science fiction’ or ‘too good to be true’. I know that reaction from personal experience: even after working at top data and analytics firms, when I first heard about the Duality Platform’s ability to allow data to be used without being decrypted, I couldn’t fathom how it could be done. Now, having joined this team of pioneers, I’m eager to help more enterprises put this technology to work and mine powerful new insights from their data goldmines.