
Advancing Privacy-Preserving AI: Duality’s Secure, Federated Learning with NVIDIA & Google Cloud

Duality 4.3: Leading the Next Era of Secure Data Collaboration

Data collaboration is critical for AI innovation, but maintaining privacy and security is a persistent challenge, especially in regulated industries like healthcare and finance. Duality Technologies v4.3 addresses this by integrating NVIDIA FLARE, Google Cloud Confidential Space, and in-house privacy-enhancing capabilities into a unified, easy-to-deploy, federated learning platform.

Duality simplifies deployment across multiple organizations with an intuitive installation process and seamless infrastructure integration. The platform facilitates secure data ingestion and preprocessing, ensuring proper alignment of disparate datasets. It also provides robust project and participant management, defining roles and responsibilities for seamless execution. Additionally, Duality enforces governance and privacy policies, including differential privacy mechanisms to protect sensitive insights. Finally, automated encryption and attestation ensure all computations are conducted securely within Trusted Execution Environments (TEEs), eliminating architectural complexities.

This powerful combination ensures that organizations can collaborate on AI models without exposing their sensitive data to each other, to the cloud provider or to Duality. NVIDIA FLARE, a domain-agnostic, open-source, extensible Python SDK, enables researchers and data scientists to adapt existing ML/DL workflows to a federated paradigm. It provides platform developers with the tools to build secure, privacy-preserving solutions for distributed multi-party collaboration (NVIDIA FLARE, GitHub). 
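To make the federated adaptation concrete, the following is a minimal, self-contained sketch of how a conventional local training step becomes a receive-train-send cycle. All function names and the toy 1-D least-squares model are illustrative assumptions; the actual NVIDIA FLARE Client API differs.

```python
# Illustrative sketch of the federated paradigm: each site trains locally
# on its own data and returns only updated weights. Not the NVIDIA FLARE
# API; names and the toy model are assumptions for demonstration.

def local_train(w, data, lr=0.1):
    # One gradient step for 1-D least squares: loss = mean((w*x - y)^2)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def federated_round(global_w, site_datasets):
    # Each site receives the global model, trains locally, and returns
    # only its updated weights; raw data never leaves the site.
    local_ws = [local_train(global_w, d) for d in site_datasets]
    return sum(local_ws) / len(local_ws)  # FedAvg-style averaging

# Two sites, both holding data consistent with y = 2x
sites = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(200):
    w = federated_round(w, sites)
print(round(w, 3))  # → 2.0, the true slope, learned without pooling data
```

The same receive-train-send structure is what FLARE lets teams retrofit onto existing ML/DL training loops.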

Google Cloud Confidential Space provides TEEs to secure model aggregation. Duality’s platform does more than just integrate existing technologies—it actively enhances federated learning workflows by adding essential security, automation, and governance capabilities to offer an end-to-end secure collaboration framework.


The Problems Federated Learning Solves

Organizations across industries need to analyze and learn from distributed data sources without compromising privacy. Federated Learning (FL) and Federated Analytics allow multiple parties to train AI models or run statistics collaboratively without sharing raw data. Duality’s platform is already deployed and operational in these critical use cases, enabling real-world privacy-preserving collaboration.

For example:

  • Healthcare: Hospitals and research institutions are using Duality’s platform to jointly develop AI models for early cancer detection while ensuring patient data never leaves their premises.
  • Healthcare: Institutions are analyzing aggregated statistics for childhood cancer research without transferring sensitive medical data between countries, with Duality providing a secure and compliant solution.
  • Finance: Banks can collaborate on fraud-detection models without exposing customer transaction records, leveraging Duality’s technology.
  • Government: Ministries can obtain aggregated statistics across municipalities or agencies while ensuring data remains within its jurisdiction, thanks to Duality’s privacy-preserving infrastructure.

FL provides an ideal privacy-preserving mechanism, but traditional approaches still have vulnerabilities.


Drawbacks of Standard Federated Learning and Analytics

While FL keeps raw data decentralized, it does not inherently secure the computed intermediate results. The locally-trained model weights, when aggregated, are often exposed in plaintext. This can lead to unintended information leakage, as adversaries can statistically infer underlying data patterns.

Recent academic research has demonstrated that local-model updates can be analyzed to reconstruct sensitive training data, presenting a significant privacy risk. Studies have shown that image reconstruction techniques can be used to recover original training data (NeurIPS 2023). Furthermore, research highlights how an aggregator can infer local training data from FL participants (arXiv 2021). Additionally, statistical results have been used to re-identify data, demonstrating potential vulnerabilities (Latanya Sweeney).
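A toy illustration of why plaintext updates leak (not a reproduction of the cited attacks, and the linear model is an assumption for clarity): for a linear model trained on a single example, the gradient is a scalar multiple of the private input, so an honest-but-curious aggregator can read the input's feature ratios straight off the update.

```python
# Toy leakage demo: for loss (w.x - y)^2, the gradient w.r.t. w is
# 2*(w.x - y) * x, i.e. the private input x up to an unknown scale.
# Illustrative only; real inversion attacks are far more sophisticated.

def gradient(w, x, y):
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [2 * err * xi for xi in x]

x_private = [3.0, 1.0, 4.0]               # one participant's sensitive record
g = gradient([0.0, 0.0, 0.0], x_private, y=5.0)

# The aggregator sees g in plaintext; component ratios of g reveal the
# ratios of the private features exactly:
ratios = [gi / g[1] for gi in g]
print(ratios)  # → [3.0, 1.0, 4.0], i.e. x_private recovered up to scale
```

With larger batches and deeper models the leakage is noisier, which is exactly what the cited reconstruction research exploits statistically.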

To mitigate this, federated learning must be reinforced with privacy-preserving aggregation methods.


Secure Aggregation Using Trusted Execution Environments

To address these vulnerabilities, Duality has integrated Google Cloud Confidential Space into its platform. Secure Aggregation ensures that AI model updates remain encrypted throughout the federated learning process, preventing unauthorized access or inference attacks.

With Google Cloud Confidential Space, all computations occur within a TEE instantiated from the hardened Confidential Space image, which permits no access unless explicitly enabled. This means that:

  • Neither Duality as the platform provider, nor the account owner, nor Google Cloud can access the raw model updates.
  • No external party can tamper with or extract sensitive information during the aggregation process.
  • Institutions can collaborate with full confidence that their proprietary data and model insights remain protected.

This integration creates a truly secure federated learning ecosystem that combines decentralized training with end-to-end encryption.
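Schematically, the guarantee rests on attestation-gated key release: the decryption key is handed only to a workload that proves it is running the pinned Confidential Space image. The sketch below is a deliberately simplified model with hypothetical names and digests; real Confidential Space attestation uses signed OIDC tokens verified against Google's public keys, not a plain claims dictionary.

```python
# Simplified model of attestation-gated key release. All names and the
# digest value are hypothetical; production attestation verifies signed
# tokens, not a bare dict.

EXPECTED_IMAGE_DIGEST = "sha256:aaaa..."  # hypothetical pinned, audited image

def release_key_if_attested(claims, key_material):
    # Release the aggregation key only to a workload whose attested image
    # digest matches the pinned Confidential Space image.
    if claims.get("image_digest") != EXPECTED_IMAGE_DIGEST:
        raise PermissionError("attestation failed: unexpected workload")
    return key_material

good = {"image_digest": EXPECTED_IMAGE_DIGEST}
print(release_key_if_attested(good, b"aggregation-key"))  # key released
```

Because the key never exists outside the attested enclave, neither the cloud operator nor the platform provider can decrypt the model updates.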


Duality’s Holistic Approach to Data Security and Privacy

Federated learning, by design, ensures data used for training remains at the source. By adding Google Cloud Confidential Space to protect the intermediate model weights, we create a fully secured, end-to-end flow where:

  1. Data remains decentralized and is never shared or moved.
  2. Model updates are encrypted in transit and at rest.
  3. Secure enclaves (TEEs) guarantee the integrity of aggregation and ensure that the machine itself cannot be tampered with.

Beyond these technical security measures, Duality provides a comprehensive suite of capabilities to simplify and secure federated learning deployments:

  • Deployment & Installation: Effortless deployment across multiple organizations with easy installation and seamless integration of federated learning and trusted execution environment within existing infrastructure.
  • Project Management: End-to-end workflow coordination for multi-party collaborations, including managing project participants and defining roles and responsibilities to ensure smooth execution.
  • Data Ingestion, Alignment, and Preprocessing: Connecting to various data sources, including object stores and databases, and ensuring proper alignment and preprocessing to harmonize disparate datasets before federated training, addressing inconsistencies, missing values, and formatting differences.
  • Automated Encryption & Attestation: Ensuring all computations are conducted in a verifiable, secure environment while automating the engineering and architectural challenges of working with TEEs. This includes simplifying workload isolation, securing model execution, and streamlining cryptographic attestation for seamless deployment.
  • Governance & Policies: Defining and enforcing permissions on who can run computations on which datasets.
  • Differential Privacy & Privacy Rules: Implementing structured thresholds and noise to protect sensitive information in results.
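As an illustration of what "structured thresholds and noise" can look like, here is a minimal sketch of a differentially private count with cohort suppression. The threshold, epsilon, and function shape are assumptions for demonstration, not Duality's actual privacy rules.

```python
import random

def dp_count(true_count, epsilon=1.0, threshold=10):
    # Structured threshold: suppress small cohorts entirely
    if true_count < threshold:
        return None
    # Laplace(0, 1/epsilon) noise as a difference of two Exp(epsilon)
    # draws; a counting query has sensitivity 1, so this is epsilon-DP.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

print(dp_count(3))    # None: cohort too small to release at all
print(dp_count(250))  # roughly 250, with calibrated noise added
```

Suppression blocks re-identification of rare individuals outright, while the noise bounds what any single record can reveal about aggregate results.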

This holistic approach ensures organizations can leverage AI while maintaining compliance with stringent regulatory frameworks.


Secure Federated Learning with Confidential Space Computation Flow

Duality’s secure, federated learning flow ensures privacy and efficiency across multiple organizations by following these key steps:

  1. The Federated Aggregator distributes data preprocessing and training parameters to participants.
  2. Each participant ingests and preprocesses its local data.
  3. The preprocessed data is used for local training.
  4. Local intermediate results are encrypted.
  5. Encrypted local training weights are sent to the Federated Aggregator.
  6. The Federated Aggregator decrypts the weights inside the TEE and aggregates the intermediate results into new model weights.
  7. For training, the process repeats until model convergence. For analytics, this step is not required, as the goal is aggregated insights rather than iterative model refinement.
  8. The newly trained model is sent to any selected party.
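The training loop of this flow can be sketched end to end. The "encrypt"/"decrypt" functions below are toy placeholders, not real cryptography: in the actual system, updates are encrypted to a key that attestation releases only inside the Confidential Space TEE.

```python
# End-to-end sketch of the computation flow. Placeholder crypto and the
# toy 1-D model are illustrative assumptions, not the production protocol.

TEE_KEY = 12345.678  # secret known only to the enclave (illustrative)

def encrypt(w, key=TEE_KEY):
    return w + key                     # placeholder, NOT real encryption

def tee_aggregate(ciphertexts, key=TEE_KEY):
    # Step 6: decrypt inside the TEE, then average (FedAvg)
    plain = [c - key for c in ciphertexts]
    return sum(plain) / len(plain)

def local_train(w, data, lr=0.1):
    # Steps 2-3: local gradient step for 1-D least squares
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

sites = [[(1.0, 3.0)], [(2.0, 6.0)]]   # both consistent with y = 3x
w = 0.0
for _ in range(100):                   # step 7: repeat to convergence
    local = [local_train(w, d) for d in sites]
    cts = [encrypt(lw) for lw in local]        # steps 4-5
    w = tee_aggregate(cts)                     # step 6
print(round(w, 3))  # → 3.0; no site's plaintext weights left its premises
```

Only ciphertexts cross organizational boundaries; plaintext weights exist solely at each site and inside the attested enclave.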

Use Case: Oncology Research at Dana-Farber Cancer Institute

Cancer research relies on large-scale digital pathology data from multiple institutions. However, regulatory and privacy concerns make data sharing a slow and complex process. Pathology images, classified as Protected Health Information (PHI), cannot be freely exchanged between organizations, creating barriers to collaboration.

To overcome this, Dana-Farber Cancer Institute partnered with Duality Technologies to implement a secure federated learning framework. Using Duality’s platform:

  • Hospitals could collaboratively train AI models on digital pathology images without transferring patient data.
  • Model training occurred locally, with only encrypted weights transmitted to an aggregation server.
  • Google Cloud Confidential Space ensured model aggregation was conducted securely, preventing data exposure.

The results demonstrated that secure federated learning enabled the organizations to train a high-quality model comparable to one produced by centralized AI training, while ensuring privacy compliance. It also surpassed the performance of single-site training while allowing data to remain in place, eliminating the need to transfer large-scale datasets and highlighting its advantage in collaborative model development.


Summary: Protecting the Data and the Model

Duality’s integration of NVIDIA FLARE and Google Cloud Confidential Space establishes a new benchmark for secure AI collaboration. By combining federated learning with privacy-enhancing technologies (PETs) such as TEEs, organizations can:

  • Build AI models collaboratively without exposing sensitive data.
  • Ensure regulatory compliance with end-to-end security.
  • Achieve high-performance AI without compromising privacy.

This groundbreaking approach paves the way for broader adoption of federated learning across industries, enabling secure data collaboration at scale. Duality’s v4.3 platform supports a range of use cases across multiple vendors: from improving AI-driven cancer detection models in healthcare to enhancing fraud detection in finance, our technology is already proving its value.

By leveraging NVIDIA FLARE, Duality ensures that organizations can securely train and deploy AI models across diverse environments while maintaining full compliance and data protection.
