The Data Challenge: Shifting from DevSecOps to MLSecOps

Derek Wood

August 08, 2024

DevSecOps has been an established practice in software security for over a decade, emphasizing the integration of security from the design phase onward. These foundational principles continue to serve as the basis for MLSecOps and AISecOps, which take them a step further.

What is DevSecOps?

DevSecOps stands for development, security, and operations. It is a methodology that integrates security into every phase of the software development lifecycle, from initial design through integration, testing, delivery, and deployment. The goal is to make security a shared responsibility among all teams involved in the development process, rather than an afterthought or a separate phase at the end of the cycle.

What is MLSecOps?

MLSecOps, short for Machine Learning Security Operations, is an extension of MLOps that specifically focuses on integrating security practices into the machine learning (ML) and AI development process. This approach addresses the security challenges associated with ML systems, such as data privacy, model integrity, adversarial attacks, and the protection of sensitive information.

The Shift From DevSecOps to the MLSecOps Framework

MLSecOps practices combine the principles of DevSecOps with the specific needs of ML systems, ensuring that security is integrated throughout the entire ML lifecycle—from data acquisition and model training to deployment and monitoring.

DevSecOps has traditionally focused on scanning for developers’ secrets, without delving deeply into data concerns. After all, software applications can be built effectively with sample data. Artificial Intelligence and Machine Learning systems, however, require real data inputs for development, training, and customization. The transition to MLSecOps therefore requires a major shift in the tools, processes, and people involved to address the 800-pound gorilla in the room: data.

Why the Focus on Data?

In MLSecOps, data is the cornerstone of the entire operation. ML and AI models are inherently data-driven, relying on vast amounts of high-quality data for development, training, and customization. This dependency on data introduces several challenges:

Data Quality and Integrity: The saying “garbage in, garbage out” is particularly relevant in ML. Poor-quality data leads to inaccurate models, causing significant downstream effects on AI applications.
Data Acquisition and Management: Teams must collaborate with third parties to acquire sufficient, high-quality data, increasing the risk of data breaches and other security issues. Without a steady stream of data, AI strategies can be slowed or even halted.
Data Governance: Understanding which datasets are being used, their origins, ownership, and contents is critical for maintaining transparency and ensuring compliance with regulatory requirements. Teams need quick answers to tough questions like “Why did the model return this result?” and “Which datasets were used to train this model?” to create transparency and uphold standards.
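One way to make the governance questions above answerable is to record, at training time, which datasets fed each model version. The following is a minimal sketch under stated assumptions, not a prescribed design: the `DatasetRecord` and `LineageRegistry` names, their fields, and the example dataset and model names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetRecord:
    """Governance metadata for one dataset (hypothetical schema)."""
    name: str
    origin: str   # where the data came from, e.g. a third-party provider
    owner: str    # team or person accountable for the dataset
    sha256: str   # fingerprint of the exact contents used for training


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the dataset contents."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class LineageRegistry:
    """Maps each model version to the datasets it was trained on."""
    _trained_on: dict = field(default_factory=dict)

    def record_training(self, model_version: str, datasets: list) -> None:
        self._trained_on.setdefault(model_version, []).extend(datasets)

    def datasets_for(self, model_version: str) -> list:
        """Answer: which datasets were used to train this model?"""
        return list(self._trained_on.get(model_version, []))

    def models_using(self, dataset_name: str) -> list:
        """Answer: which models are affected if this dataset turns out to be bad?"""
        return sorted(
            version
            for version, records in self._trained_on.items()
            if any(record.name == dataset_name for record in records)
        )


# Illustrative data only; names and contents are made up.
registry = LineageRegistry()
claims = DatasetRecord("claims-2023", "partner-a", "data-eng",
                       fingerprint(b"id,amount\n1,100\n"))
labs = DatasetRecord("labs-2023", "internal", "research",
                     fingerprint(b"id,result\n1,0.7\n"))
registry.record_training("risk-model:v1", [claims])
registry.record_training("risk-model:v2", [claims, labs])
```

Storing a content fingerprint alongside origin and ownership means the registry can also show whether the data a model was trained on has changed since, which ties the governance question back to data integrity.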

Tools and Technologies

The transition to MLSecOps demands innovative tools and technologies to secure both data and models:

Confidential Computing Environments: Also known as Trusted Execution Environments (TEEs), these secure areas process sensitive data and train models while protecting intellectual property and ensuring data privacy. However, these environments require significant development and expertise to operationalize.

IMDA’s recent case study involving the use of TEEs found the same.

Secure Collaborative AI Solutions: Platforms like Duality’s Secure Collaborative AI integrate TEEs and governance frameworks, enabling MLSecOps teams to manage data security, privacy, and model transparency effectively. Duality simplifies the use of TEEs and incorporates governance, allowing MLSecOps teams to easily address transparency and explainability questions while safeguarding model intellectual property.

Case Study: Confidential Computing and Federated Learning for AI in Oncology

People and Processes

The shift to MLSecOps also impacts the roles and processes within an organization:

Cross-Disciplinary Collaboration: Securing AI requires the involvement of security and IT resources, data scientists, researchers, business teams, and privacy experts. This convergence of roles can be challenging due to differing priorities and methodologies. For instance, DevSecOps teams are familiar with working alongside engineering teams, who understand security principles and processes for validation, checks, and deployment. However, collaborating with data scientists and researchers, who may lack strong privacy or security knowledge and don’t typically follow structured methodologies, requires a different approach.
Security by Design: Incorporating privacy, security, and model IP protections from the outset of any AI strategy is essential. This proactive approach helps prevent disruptions in data operations, business processes, and go-to-market efforts.

Additional Considerations When Transitioning to MLSecOps

Adversarial Machine Learning and Security Risks

Adversarial machine learning (AML) poses a significant threat to AI systems and their security. Attackers can manipulate input data to deceive ML models, leading to incorrect predictions and potentially harmful outcomes. To counter these risks, MLSecOps must include defenses against adversarial attacks, such as:

Adversarial Robustness Toolbox (ART): An open-source library for testing and improving the robustness of ML models against adversarial threats.
Regular Security Testing: Implementing continuous security testing and threat modeling to identify vulnerabilities and mitigate potential risks.
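To make the idea of manipulated input concrete, here is a toy sketch of the fast gradient sign method (FGSM) applied to a hand-written logistic regression model. It is an illustration under stated assumptions, not the API of any particular toolkit: the weights, the input, and the `epsilon` budget are made-up values chosen so the effect is visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, label, epsilon):
    """Shift each feature by +/- epsilon in the direction that increases the loss.

    For logistic regression with log loss, the gradient of the loss with
    respect to the input x is (p - label) * weights.
    """
    p = predict_proba(weights, bias, x)
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


# Hypothetical model and input, for illustration only.
weights, bias = [2.0, -1.0], 0.0
x, label, epsilon = [0.3, 0.2], 1, 0.2

x_adv = fgsm_perturb(weights, bias, x, label, epsilon)
p_clean = predict_proba(weights, bias, x)      # above 0.5: classified as 1
p_adv = predict_proba(weights, bias, x_adv)    # below 0.5: classified as 0
```

A bounded change of at most `epsilon` per feature is enough to flip the prediction here. Robustness testing looks for exactly this kind of weakness before an attacker does.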

Supply Chain Vulnerabilities

The ML supply chain is another area of concern. Vulnerabilities in data storage, software components, and communication networks can be exploited by attackers or malicious code. To secure the ML supply chain, organizations should:

Conduct Supply Chain Vulnerability Assessments: Regularly evaluate the security of all components within the ML supply chain.
Implement Access Controls: Ensure that only authorized personnel have access to sensitive data and systems.
Monitor and Update Systems: Continuously monitor and update systems to stay ahead of emerging cyber threats.
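One small, concrete control in this area is to pin a cryptographic digest for each model or dataset artifact and refuse to load anything that does not match. The sketch below uses only the Python standard library. The function names and the temporary `model.bin` file are hypothetical, and a real pipeline would keep the pinned digest somewhere the artifact's producer cannot silently change.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Stream the file so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file's digest matches the pinned value."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.lower())


# Demonstration with a throwaway file standing in for a model artifact.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.bin"
    artifact.write_bytes(b"trusted model weights")
    pinned = sha256_of_file(artifact)          # recorded when the artifact is approved

    ok_before = verify_artifact(artifact, pinned)

    artifact.write_bytes(b"tampered model weights")  # simulate a swapped artifact
    ok_after = verify_artifact(artifact, pinned)
```

Here `ok_before` is `True` and `ok_after` is `False`, so a loader that checks the result can stop before deserializing an artifact that changed after approval.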

Best Practices for MLSecOps

Implementing MLSecOps effectively requires organizations to consider several best practices. These ensure the security, reliability, and compliance of AI and machine learning systems, paving the way for a future where AI can be safely and effectively integrated into all aspects of business and society.

Organizations should consider the following best practices:

Shift Left: Integrate security checks and balances early in the development process to mitigate potential risks before they escalate.
Comprehensive Data Management: Maintain a clear understanding of data sources, ownership, and usage to ensure data integrity and compliance.
Continuous Monitoring and Auditing: Regularly audit the ML environment and conduct vulnerability assessments to identify and address potential threats.
Educational Resources: Provide training and educational resources so that stakeholders can recognize and respond to AI security risks.
Industry Standards: Adopt industry standards and frameworks to guide the implementation of effective MLSecOps practices.
Collaboration with Experts: Engage with AI security experts, such as those from the MLSecOps podcast or the Linux Foundation, to gain valuable insights and support.
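As one illustration of how "Shift Left" and "Comprehensive Data Management" can combine, a pipeline can refuse to start training when a dataset's governance metadata is incomplete. This is a sketch under assumptions: the required field names (`source`, `owner`, `permitted_use`) and the example records are hypothetical, and each organization would define its own policy.

```python
# Hypothetical policy: every dataset must declare these fields before training.
REQUIRED_FIELDS = ("source", "owner", "permitted_use")


def governance_gaps(datasets):
    """Return {dataset name: [missing fields]} for records with gaps."""
    gaps = {}
    for record in datasets:
        missing = []
        for field_name in REQUIRED_FIELDS:
            value = record.get(field_name)
            if value is None or not str(value).strip():
                missing.append(field_name)
        if missing:
            gaps[record.get("name", "<unnamed>")] = missing
    return gaps


def pre_training_gate(datasets):
    """Raise before any training starts if metadata is incomplete."""
    gaps = governance_gaps(datasets)
    if gaps:
        raise ValueError(f"Training blocked; incomplete dataset metadata: {gaps}")
    return True


# Illustrative records only.
datasets = [
    {"name": "claims-2023", "source": "partner-a", "owner": "data-eng",
     "permitted_use": "fraud models"},
    {"name": "scraped-forum", "source": "web", "owner": "", "permitted_use": None},
]
gaps = governance_gaps(datasets)
```

Running the gate as an early pipeline step surfaces the missing ownership and usage information before compute is spent, and before a model with unclear provenance reaches production.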

Machine Learning Security Operations with Duality Technology

MLSecOps is crucial for realizing AI’s growth potential because it tackles the unique data challenges inherent in ML systems. By borrowing lessons from DevSecOps and integrating data operations, organizations can ensure the security, privacy, and integrity of their AI models. Model IP protection must also be in place to support customer-facing AI services, most of which will require customization with client data.

Duality’s Secure Collaborative AI platform exemplifies how MLSecOps can be operationalized, providing a comprehensive solution that integrates privacy, security, and governance by design.

Contact Duality today to see how we can make MLSecOps work for you.
