Data clean rooms (DCRs) are becoming a necessity for companies that do a lot of data-driven work, but they come with their own advantages and disadvantages. Firms that share first-party data with other parties, such as outsourcing partners and research institutions, can use data clean rooms to ensure that their data is not exposed during collaboration. Data clean rooms also help firms comply with privacy regulations, because they let a firm collaborate on data with third parties without handing over the underlying records. In this blog, we will explore the pros and cons of DCRs.
Advantages of Data Clean Rooms
Some of the most notable benefits of using a data clean room relate to data privacy and data governance.
- Compliance with privacy requirements – Data clean rooms help businesses comply with privacy regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). For companies in the advertising space, data clean rooms are a privacy-friendly solution for analyzing audiences, targeting ads, and measuring performance, offering a means to gain insights from shared data without the use of soon-to-be-obsolete third-party cookies.
- Data owners maintain control – Data clean rooms facilitate collaboration by sharing data anonymously. Even though user-level data is added to a data clean room, it’s not exposed to other parties; in this sense, the owners do not relinquish control of their data. For example, data clean rooms can help banks fix the age-old problem of outdated addresses. Banks and postal companies can match their customer addresses so that the banks can update their mailing lists without disclosing their customer data to the postal company.
- Versatile and fast analysis – In data clean rooms, data can be analyzed with popular languages like Python and SQL. Data clean rooms support advanced analysis, including machine learning, and generally deliver fast results. For advertisers, some data clean rooms deliver a holistic view of campaign performance across various distribution channels. For example, for media campaigns, data clean rooms can provide a variety of analyses: consumer journey, audience overlap, and reach and frequency, as well as mix modeling and scenario planning using AI and machine learning.
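As a rough illustration of the matching and audience-overlap ideas above, the following Python sketch shows how two parties might compare audiences on pseudonymized identifiers so that only aggregate counts are reported. The shared key, the function names (`pseudonymize`, `audience_overlap`), and the sample email addresses are all hypothetical, and real data clean rooms typically rely on stronger protections than keyed hashing alone.

```python
import hashlib
import hmac

# Hypothetical secret agreed on by both parties (an assumption for this sketch).
SHARED_KEY = b"example-shared-key"


def normalize(email: str) -> str:
    """Trim whitespace and lowercase so both parties derive identical tokens."""
    return email.strip().lower()


def pseudonymize(email: str, key: bytes = SHARED_KEY) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, normalize(email).encode("utf-8"), hashlib.sha256).hexdigest()


def audience_overlap(party_a: list[str], party_b: list[str]) -> dict[str, int]:
    """Return only aggregate counts; no user-level rows are returned."""
    tokens_a = {pseudonymize(e) for e in party_a}
    tokens_b = {pseudonymize(e) for e in party_b}
    return {
        "size_a": len(tokens_a),
        "size_b": len(tokens_b),
        "overlap": len(tokens_a & tokens_b),
    }


# Made-up sample data for illustration only.
advertiser = ["Ana@example.com", "bo@example.com", "cy@example.com"]
publisher = ["ana@example.com ", "cy@example.com", "dee@example.com", "eli@example.com"]

result = audience_overlap(advertiser, publisher)
print(result)
```

Note that the normalization step matters: without it, "Ana@example.com" and "ana@example.com " would produce different tokens and the overlap would be undercounted.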
In short, DCRs address the key pain points of data teams and privacy teams alike by protecting privacy and enforcing governance on the one hand, while allowing for fast analysis on the other. They are not perfect, however.
Disadvantages of Data Clean Rooms
Data clean rooms, while currently popular with data scientists and security professionals, are not a panacea for the conflict between data privacy and data utility.
Common challenges include:
- Receiving less accurate data outputs – Data clean rooms use aggregated data in their reports, which is less precise than ID-based data. Methods like anonymization that are applied in data clean rooms to preserve privacy reduce the granularity of the data stored and analyzed, resulting in less accurate insights, as discussed in our post on Data Anonymization Techniques. Anonymized data no longer supports targeted offers, for example, because insights cannot be connected to an individual. In some cases, the privacy-preserving techniques also scramble the relationships between data points, and the lost relationships impede AI and data science work.
Before data is uploaded to a data clean room, it has to be unified into one format. In a collaborative process, match rates are critical: the match rate is the percentage of individuals in one data set who can be successfully matched with individuals in another. Sub-par match rates (as low as 39%) reduce the value of data clean rooms in the advertising space.
It is also worth noting that data clean rooms currently operate on relatively small volumes of data, which means that companies requiring scale must sometimes use modeling to bridge the gap.
- Data is not interoperable – Many data clean rooms are walled gardens and work only for a specific platform (e.g., Google or Facebook). This means advertisers are forced to manually combine results from different data clean rooms.
Additionally, data clean rooms are new enough that universal standards haven’t yet been adopted for their implementation. That means that platforms and advertisers may be trying to pool data that exists in multiple formats, and the prep work that goes into aggregating those different formats can be time-consuming.
- Parties hesitate to share data – To generate insights, advertisers have to hand over their valuable first-party data. In the worst-case scenario, a data breach could lead to hefty fines, not to mention reputational damage and lost clients. Willingness to share customer data varies: one company might be willing to add all of its customer data to a data clean room, while another might add only half. The more reluctant parties are to share their data, the less effective the data clean room will be for its clients.
- Privacy risk remains – Some data clean rooms are manually managed, making them vulnerable to human errors, such as granting access to people who shouldn’t have it, incorrectly formulating queries, and exchanging data in an unsecured environment. More generally, there is still a risk of a data breach or leak even with the various privacy-preserving techniques employed in data clean rooms.
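To make two of the challenges above more concrete, here is a minimal Python sketch of how a match rate might be computed and how aggregated reporting can hide small groups. The identifiers, the region labels, the function names, and the `MIN_GROUP_SIZE` threshold are assumptions chosen for illustration; actual data clean rooms define their own matching logic and aggregation rules.

```python
from collections import Counter

# Hypothetical minimum group size below which a result is withheld.
MIN_GROUP_SIZE = 3


def match_rate(own_ids: set[str], partner_ids: set[str]) -> float:
    """Share of our own records that also appear in the partner's data set."""
    if not own_ids:
        return 0.0
    return len(own_ids & partner_ids) / len(own_ids)


def aggregate_with_suppression(
    rows: list[tuple[str, str]], min_group_size: int = MIN_GROUP_SIZE
) -> dict[str, int]:
    """Count users per region and drop groups smaller than the threshold."""
    counts = Counter(region for _user_id, region in rows)
    return {region: n for region, n in counts.items() if n >= min_group_size}


# Made-up sample data for illustration only.
own = {"u1", "u2", "u3", "u4", "u5"}
partner = {"u2", "u4", "u9"}
rate = match_rate(own, partner)
print(f"match rate: {rate:.0%}")

rows = [
    ("u1", "north"), ("u2", "north"), ("u3", "north"),
    ("u4", "south"), ("u5", "south"),
]
report = aggregate_with_suppression(rows)
print(report)
```

In this sample, only two of five records match, and the "south" group is withheld because it has fewer than three users. That is the trade-off in miniature: the report protects individuals but tells the analyst less.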
Weighing the Pros and Cons
For data-driven enterprises, data clean rooms have clear advantages, including adherence to privacy regulations and restrictions, the ability to collaborate on data without revealing first-party data to other parties, and a range of data analysis possibilities. However, they also have disadvantages, including less accurate data outputs, limited interoperability, a need for parties to share data they may be reluctant to share, and ongoing privacy risks.
According to Gartner, “by 2026, 80% of organizations pursuing a 360-degree view of the customer will abandon it because it doesn’t adhere to data privacy regulations, relies on obsolete data collection methods and obliterates customer trust” (“A Guide to What Is—and Isn’t—a Customer Data Platform,” 2 March 2022). With this perspective in mind, forward-looking data-driven enterprises will explore options that can continue to meet their needs and evolving privacy regulations and expectations. This is likely to be a holistic privacy-preserving data collaboration platform that allows flexible selection and combination of multiple privacy enhancing technologies (PETs) as needed across the organization and data sources.
To further understand DCRs, how they interact with PETs, and what considerations to make when choosing a DCR, read our eBook, “The Privacy Professional’s Guide to Data Clean Rooms.”