Data collaboration between multiple parties is revolutionary, offering almost endless potential for businesses and societies to innovate and improve. Yet it is also fundamental, increasingly a necessity for companies to unlock the insights they need to serve their customers effectively and maximize revenue. Collaboration often involves sensitive data, including personally identifiable information (PII). As a result, perhaps the greatest challenge in data collaboration is keeping that data secure and complying with regulations, such as the well-known General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
The advertising industry currently faces specific challenges around data collaboration. In particular, five state-level laws in California, Virginia, Colorado, Connecticut, and Utah now impose new restrictions on the collection and use of personal data. The deprecation of third-party cookies is further altering the advertising landscape, with Mozilla Firefox and Apple Safari having already blocked third-party cookies, and Google Chrome announcing plans to do so by 2024. All these changes limit companies’ ability to collect, use, and share data.
Businesses have responded to this changing landscape by investing in privacy-preserving technology. In this context, data clean rooms have emerged as a prominent collaboration tool. A frequently cited projection from Gartner is that 80% of advertisers with media budgets of $1 billion or more will be using data clean rooms by 2023. In this blog, we examine data clean rooms and their advantages and drawbacks.
Data clean rooms are intended to be secure environments that allow multiple parties to collaborate using proprietary data while complying with privacy regulations. The goal is for parties to be able to use data clean rooms without concern over the potential risks to data that may be considered sensitive, such as PII, device IDs, and other geographic, behavioral, audience, or contextual data. The data shared in the clean room is anonymized and then analyzed. In an advertising context, data clean rooms can replace third-party cookies as a way for advertisers to match their first-party data with that of other companies and come away with useful analytics. Because data clean rooms are designed to protect privacy, all outputs are based on aggregated data.
For example, collaboration between the buyers and sellers of digital advertising can uncover insights into audience and behavior, helping advertisers evaluate and refine their campaigns.
The output from a data clean room can be compared to standing outside a house party. You can hear the beat of the music and the rise and fall of conversations. You can see that people are dancing in time to the music. But the window you’re looking through is frosted, so you can’t make out individual figures.
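To make the idea of aggregate-only output concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular clean room works: the function name `aggregate_report`, the sample rows, and the minimum group size of 3 are all hypothetical. The sketch returns counts per group and drops any group too small to hide an individual.

```python
from collections import Counter

# Hypothetical threshold: groups smaller than this are suppressed.
MIN_GROUP_SIZE = 3


def aggregate_report(rows, group_key, min_group_size=MIN_GROUP_SIZE):
    """Return counts per group, omitting groups below the minimum size."""
    counts = Counter(row[group_key] for row in rows)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Illustrative matched records; these never leave the clean room.
matched_rows = [
    {"user": "u1", "region": "west"},
    {"user": "u2", "region": "west"},
    {"user": "u3", "region": "west"},
    {"user": "u4", "region": "east"},
]

# Only the aggregate is released; the single "east" user is not reported.
report = aggregate_report(matched_rows, "region")
print(report)
```

In the terms of the house party analogy, the caller sees how large each crowd is, but never a row that identifies one person.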
Data has three basic states: at rest, in transit, and in use. Sensitive data such as PII is most vulnerable while in use (being processed, analyzed, or manipulated). This is the challenge data clean rooms aim to help address.
Data clean rooms generally offer a variety of privacy features to protect data, including the following:
Companies often use data clean rooms in combination with other technologies for managing customer data and ensuring privacy. Following are a few of the most common examples.
According to an IAB-commissioned Ipsos report, 84% of data clean room users in the digital advertising space are also using customer data platforms (CDPs). A CDP is a marketing software application that unifies a company’s customer data from all channels. CDPs can guide the timing and targeting of messages and customer engagement activities, and support analysis of behavior at an individual level. Data clean rooms can be used within CDPs to support, for example, advertising measurement.
Identity resolution (IDR) aims to link records across one or more datasets (usually from multiple parties) that refer to the same individual. IDR providers use match keys, such as an email, cookie, or IP address, to identify when two records refer to the same individual or household.
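As a rough illustration of matching on a key, the Python sketch below links records from two parties by a normalized, hashed email address. It covers only exact matching on a single key, and the helper names (`match_key`, `link_records`), the record layout, and the choice of SHA-256 are assumptions made for this example rather than a description of any provider's method.

```python
import hashlib


def match_key(email: str) -> str:
    """Normalize an email and hash it so both parties derive the same key."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def link_records(party_a, party_b):
    """Return (id_a, id_b) pairs whose hashed emails are identical."""
    index_b = {match_key(record["email"]): record for record in party_b}
    links = []
    for record in party_a:
        other = index_b.get(match_key(record["email"]))
        if other is not None:
            links.append((record["id"], other["id"]))
    return links


# Illustrative records; the addresses are made up.
advertiser = [
    {"id": "a1", "email": "Pat@Example.com "},
    {"id": "a2", "email": "lee@example.com"},
]
publisher = [
    {"id": "b1", "email": "sam@example.com"},
    {"id": "b2", "email": "pat@example.com"},
]

links = link_records(advertiser, publisher)
print(links)
```

Normalizing before hashing matters here: without trimming and lowercasing, "Pat@Example.com " and "pat@example.com" would produce different keys and the two records would not be linked.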
Though not strictly defined as privacy enhancing technologies (PETs) themselves, data clean rooms are often used in conjunction with PETs or with other privacy-preserving technologies.
Confidential computing is a hardware-based technology designed to protect data in use. While this approach to enhancing data security in the cloud is cutting edge and still developing, confidential computing is viewed as having great potential when paired with data clean rooms to protect data while it is being processed or analyzed.
Data clean rooms are not a homogeneous offering. Following are a few basic categorizations for data clean rooms.
Walled gardens are perhaps the most familiar type of data clean room. Google, Amazon, and Facebook all provide hashed and aggregated data to companies that use their advertising platforms in order to evaluate advertising performance. These clean rooms are “walled” and do not provide a cross-platform view.
Another way of categorizing data clean room offerings is by distinguishing between self-service and managed service.
Self-service clean rooms provide access to the technology platform only, and do not offer support in collaborating with partners. This option is attractive for companies that need more granular data about their audience and how that fits with data-sharing partners, and have the resources to handle data partnerships at scale and assume liability for data mishaps.
With a managed service offering, companies upload data and the clean room provider manages all the data partnerships inside and outside their platform. The considerations are the opposite of self-service: the company is not responsible for coordinating data sharing or for legal liability, but it generally cannot access more granular data.
For instance, a managed service clean room might relay what percentage of the data shared with a partner audience is aligned, which can help with media buying decisions; but it won’t say which specific hashed emails did not align.
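The kind of summary described above can be sketched in a few lines of Python. This is a simplified illustration, not a real clean room API: the function name `overlap_rate` and the placeholder hash values are hypothetical. The point is that the caller receives a single percentage and never the list of hashed emails that did or did not match.

```python
def overlap_rate(own_hashed, partner_hashed):
    """Return the percentage of our hashed emails found in the partner's set."""
    own = set(own_hashed)
    if not own:
        return 0.0
    matched = own & set(partner_hashed)
    # Only the aggregate percentage is returned, never the matched set itself.
    return round(100 * len(matched) / len(own), 1)


# Placeholder values standing in for hashed emails.
our_audience = ["h1", "h2", "h3", "h4"]
partner_audience = ["h2", "h4", "h9"]

rate = overlap_rate(our_audience, partner_audience)
print(f"{rate}% of our audience overlaps with the partner's")
```

Here two of the four hashed emails appear in the partner's audience, so the reported overlap is 50.0%.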
Data clean rooms are critical to the data analytics process in any industry, and their importance cannot be overstated. A clean room will often have staff working around the clock ensuring that data is correct, secure, and private. From historical stock prices to medical records, a clean room is essential for any business handling large amounts of sensitive information.
We explored the different types of data clean rooms – and what they do and don’t do for businesses. Stay tuned for the next post in our series, “Data Clean Rooms: Advantages and Disadvantages.”
Want to jump right into a deep dive of DCRs and considerations when choosing a DCR? Check out our eBook, “The Privacy Professional’s Guide to Data Clean Rooms.”