Summary: The ability to collaborate and share insights with anyone, anywhere, in any environment, and with any type of data – while satisfying compliance by default – is a force multiplier for any data strategy. In fact, acquiring the volume and variety of data to use in support of data-driven initiatives is one of the largest challenges facing data teams today, especially when developing and training ML and AI models. Such challenges are precisely what secure data collaboration solves.
“Any sufficiently advanced technology is indistinguishable from magic.”
– Arthur C. Clarke
Secure Data Collaboration means being able to generate valuable insights from data without needing or having access to the underlying data. Admittedly, it sounds like a paradox. How can anyone analyze data without access to it? To this day, the common understanding is that using data means having access to it, which increases data risk and expands the attack surface. Zero-trust, insider threat prevention, just-in-time access management, and third-party risk management are all about minimizing the access users need to do their jobs. Secure data collaboration uses various leading-edge technologies to break the relationship between data risk and data use, unlocking doors to valuable data sharing and collaboration opportunities. It removes the need for analyzers to access raw data to run queries, analytics, and models to support data-driven growth and innovation initiatives.
This is a force multiplier for data-driven organizations. It changes the fundamental understanding we have of data use and data risk. To date, organizations have been limited to addressing these points of exposure and risk through a combination of process, data scrubbing, and legal liability protections. Now, there’s a technical solution from which every organization can benefit in at least one of the following ways:
Read on to discover what each of these means to the modern data-driven enterprise.
Duality has designed our platform around the idea that the underlying technology should satisfy privacy, security, and governance requirements by default. This begins with the platform being built upon leading-edge technologies that remove the need to expose data to those who must use it for analysis and data-driven decisions. On top of that, a built-in collaboration layer includes governance tools that let data owners define which analyzers can run which computations and how often, with the reporting and audit trails needed to prove it.
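To make the governance idea concrete, here is a minimal sketch of how an owner-defined policy might gate computations and record an audit trail. It is illustrative only and is not Duality's implementation: the names `Policy`, `Gatekeeper`, `authorize`, `max_runs`, and the example analyzer and computation labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Policy:
    """Hypothetical rule set a data owner attaches to a dataset."""
    allowed: Dict[str, Set[str]]  # analyzer -> computations they may run
    max_runs: int                 # run limit per analyzer, per computation


@dataclass
class Gatekeeper:
    """Checks every request against the policy and logs the decision."""
    policy: Policy
    runs: Dict[Tuple[str, str], int] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def authorize(self, analyzer: str, computation: str) -> bool:
        key = (analyzer, computation)
        if computation not in self.policy.allowed.get(analyzer, set()):
            decision = "denied: not permitted"
        elif self.runs.get(key, 0) >= self.policy.max_runs:
            decision = "denied: run limit reached"
        else:
            self.runs[key] = self.runs.get(key, 0) + 1
            decision = "allowed"
        # Every request is recorded, whether it was allowed or denied.
        self.audit_log.append((analyzer, computation, decision))
        return decision == "allowed"


# Example: one analyzer may run one approved computation, at most twice.
policy = Policy(allowed={"analyst_a": {"mean_age"}}, max_runs=2)
gate = Gatekeeper(policy)
print(gate.authorize("analyst_a", "mean_age"))     # approved computation
print(gate.authorize("analyst_a", "export_rows"))  # unplanned request
```

In this sketch an unplanned request is refused and still lands in the audit log, which is the property that matters in the breach scenario discussed next.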
Why does this matter? Let’s take the example of Atlassian’s March 2023 announcement of a major data breach, which was initiated through a 3rd party data relationship. The breach occurred through a single analyst whose account was accessed, allowing the cybercriminal group to download, sell, and publish sensitive Atlassian data.
For the 3rd party, this could be addressed with improved security training plus improved insider threat detection and mitigation. But still, the analyst required access to this data set to do their job, so when their account was breached, that was it. If the 3rd party had used a secure data collaboration service, the breach would have been extremely limited in impact. The infiltrators would have been limited to strictly governed queries or analytics with access to results only: far less rewarding for the hackers, and a far higher level of trust to offer clients.
The breach could also have been far less severe had Atlassian used a secure data collaboration platform for their 3rd party data arrangements. Thanks to the built-in governance controls, any unplanned queries or analytics would have been blocked, which would also have shortened the time to detection. With their current form of perpetual access, the only option organizations have is to employ complex anomaly detection algorithms or just-in-time access management flows that may have noticed unusual activity. Either of these options can be difficult to scale. With secure data collaboration, any breach of the 3rd party’s users, network, or services would be confined to governed queries and analytics, with no ability to steal or leak the raw data as occurred in this case.
Given the choice, any CIO, CISO, legal, or privacy leader would choose the path that removes the need for additional points of data exposure. The question becomes, how important is reducing attack surfaces and eliminating an entire category of risk to your organization? Wouldn’t it be easier to gain support if a change leads to revenue growth?
According to Gartner, fewer than half of data and analytics leaders (44%) report that their team is effective in providing attributable value to their organization. For organizations handling sensitive data, it’s common for significant resources to go into setting up any 3rd party data arrangement, whether sharing or collaborating. This is especially true when bridging data silos or working across jurisdictions, where data protection methods like deidentification and anonymization come into play, often followed by a costly and risky migration of data across borders. When it comes to data-driven initiatives, complex processes and data movement impact efforts by:
By defaulting to a secure data collaboration platform, all 3 issues are addressed. By not needing to remove key points of data context to satisfy deidentification, the results of queries and analyses will take advantage of the full quality of the source data set. This means higher quality insights and therefore a better ROI on the entire effort. For a real-life example, read our case study about how we are accelerating breakthroughs in healthcare studies.
By removing lengthy negotiations of what data should be kept versus removed, teams can focus on doing the work. This reduces the costs of such initiatives as well as the time to insights, both great for showing an ROI to the business.
This is the most impactful category of benefits. We’re talking about collaborating with anyone, anywhere, in any environment, with any data type. This is the true force multiplier that no data-driven organization can afford to miss. It provides a clear path to truly borderless data, expanded and simplified data monetization, and data-centric AI/ML model development. However, these “couldn’t be done before” items are perhaps the hardest to grasp. After all, if collaborating across borders or monetizing data with 3rd parties has simply been a non-option due to prohibitive processes or privacy and compliance regulations, the possibility leaves people’s minds. Instead, those involved are likely to accept these limits as permanent rules of law, searching for workarounds if they don’t retreat from such markets entirely.
Borderless data is a major challenge and opportunity for many organizations. Take cross-border collaboration in oncology research. In 2023, Duality partnered with Tel Aviv Sourasky Medical Center, Israel’s leading multidisciplinary healthcare institution, to facilitate collaborative oncological real-world evidence (RWE) studies while protecting private health information (PHI). From their patient care and research, they have developed a treasure trove of data valuable to any of the thousands of cancer institutes, pharmaceutical companies, or health service organizations found throughout the world. Without secure data collaboration that satisfies complete anonymization (as required by numerous privacy laws), collaborative efforts were far more expensive, limited, and less common. Essentially, the global collaboration required to solve global health issues just wasn’t feasible at a rate that yielded much progress. They needed a better way.
They found their way with secure data collaboration and a formalized partnership with Duality. By removing the need for direct access to data to run analytics and queries, the opportunities for sharing, collaboration, and saving lives blossomed.
Some new technology is simple to understand – it saves time or money. Other technology takes a moment, because it requires us to rewire our understanding of the world. Secure data collaboration is the latter. To paraphrase the most common response heard from various security and data leaders, “That’s huge if true. I just don’t know if I believe it. How does it work?”
Duality is able to bring such a solution to market by virtue of our de facto status as global leaders in advanced cryptographic technologies such as Fully Homomorphic Encryption (FHE) and Secure Federated Learning (FL), to name two. From there, we’ve layered in a fully integrated computation engine and collaboration management layer to provide the required governance and data management capabilities critical to any data initiative.
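For readers asking “How does it work?”, the core idea of computing on data that stays encrypted can be shown with a toy example. The sketch below uses the textbook Paillier scheme, which is only additively homomorphic: it is not FHE (which supports both addition and multiplication on ciphertexts) and it is not Duality’s implementation. The primes are deliberately tiny and insecure, and the function names are chosen here purely for illustration.

```python
import math
import secrets


def keygen(p: int = 1789, q: int = 1861):
    """Toy Paillier key pair. These primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer m (0 <= m < n) under the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts yields an encryption of the plaintext sum."""
    return (c1 * c2) % (n * n)


# The data owner encrypts two values; the analyzer sees only ciphertexts.
n, private_key = keygen()
c1, c2 = encrypt(n, 120000), encrypt(n, 345678)
c_sum = add_encrypted(n, c1, c2)        # computed without decrypting anything
print(decrypt(n, private_key, c_sum))   # the key holder recovers the sum
```

The party running `add_encrypted` never holds the private key or the plaintext values, yet the key holder can decrypt a correct sum. That separation between computing on data and seeing it is the property this article describes.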
So, here we are, with proof of viability and availability to try it yourself today. In years to come, we’ll look back at the ways we attempted to secure the use of sensitive data much as we now look at leeching in medicine or, more recently, at how SSL/TLS encryption wasn’t standard for web traffic until the late 2010s. It’s real. It’s here. It’s ready.
Even with such a platform ready for market, we’re still coming out of the “wild wild west” days of data analytics, BI, and data partnership arrangements, wherein the increase in risk is both accepted and expected in exchange for business growth. Thanks to solutions like Duality’s, we no longer have to make that trade and instead have a force multiplier that will improve the impact of data analytics and machine learning for any department in any organization.
Join our upcoming webinar for more information about how such a platform can benefit your organization.
Interested in technical details? Read our whitepaper.