Privacy-enhancing technology (PET) is a privacy control consisting of information and communication technology (ICT) measures, products or services that protect privacy by eliminating or reducing personally identifiable information (PII), or by preventing unnecessary and/or undesired processing of PII, all without losing the functionality of the ICT system.
Examples of PETs include, but are not limited to, anonymization and pseudonymization tools that eliminate, reduce, mask or de-identify PII or that prevent unnecessary, unauthorized and/or undesirable processing of PII.
Vast amounts of data are created every single day, not just by humans but also by machines using modern technologies such as artificial intelligence (AI) and machine learning (ML). Privacy breaches can damage a business's reputation and cause harm to data subjects, so it is necessary for businesses to protect the vast amounts of data they generate. The objective of PETs is to protect personal data.
PETs are important because data protection laws such as India's Digital Personal Data Protection Act (DPDPA), the EU's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) require businesses to protect personal data or face heavy penalties. These mandates do not explicitly mention PETs, but they are understood to encompass such technologies. PETs are also vital for fulfilling contractual privacy obligations to customers.
The UK Information Commissioner's Office's PETs guidance classifies the PETs that can help achieve data protection compliance, including data protection by design and by default, into three groups:
- PETs that derive or generate data that reduce or remove the identifiability of individuals, helping to fulfil the data minimization principle. Examples include differential privacy and synthetic data.
- PETs that “focus on hiding and shielding data to help achieve the requirements of the security principle.” Examples include homomorphic encryption (HE) and Zero-Knowledge Proofs (ZKP).
- PETs that split or control access to personal data to help fulfil both data minimization and security principles depending on the nature of the processing. Examples include trusted execution environments (TEEs), secure multiparty computation (SMPC) and federated learning.
Now, let us look at these examples in detail.
Differential privacy
Differential privacy protects against revealing information about specific individuals. Rather than a cryptographic algorithm, it is a mathematical technique that adds a layer of statistical noise to a dataset, making it possible to describe patterns of groups within the dataset while preserving the privacy of individuals.
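To make this concrete, here is a minimal sketch of the Laplace mechanism, the most common way to add differentially private noise to a count query. The dataset, predicate and epsilon value are illustrative assumptions, not from any particular library.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of the Laplace distribution
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon, rng):
    """Differentially private count: the true count plus Laplace(sensitivity/epsilon) noise.
    The sensitivity of a count query is 1, because adding or removing one
    person changes the count by at most 1."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 61, 38, 45]      # hypothetical dataset
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
```

A smaller epsilon means more noise and stronger privacy; an analyst sees roughly how many people are 40 or older, but cannot tell whether any one individual is in the dataset.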
Synthetic data
Synthetic data is data created artificially using algorithms, including ML algorithms. The technique transforms a sensitive dataset into a new dataset with similar statistical properties, without revealing information about individuals from the original dataset.
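As a deliberately simple sketch of the idea, the following generates synthetic values that mimic the mean and spread of an original column by sampling from a fitted Gaussian. Real synthetic-data tools use far richer models; the column and sample size here are illustrative assumptions.

```python
import random
import statistics

def synthesize(column, n, rng):
    """Generate n synthetic values that mimic the mean and spread of `column`
    without copying any original record (a deliberately simple Gaussian model)."""
    mu = statistics.mean(column)
    sigma = statistics.stdev(column)
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(42)
salaries = [52_000, 61_000, 48_000, 75_000, 66_000, 58_000]  # hypothetical sensitive column
fake = synthesize(salaries, 100, rng)
```

The synthetic column supports the same aggregate analyses as the original, but no value in it belongs to a real person.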
Homomorphic encryption
HE is an encryption method that enables computational operations on encrypted data. It generates an encrypted result which, when decrypted, matches the result of the same operations performed on the unencrypted data (i.e., plaintext). This enables encrypted data to be transferred, analyzed and returned to the data owner, who can decrypt it and view the results on the original data. Companies can therefore share sensitive data with third parties for analysis, and HE is also useful in applications that hold encrypted data in cloud storage. There are three types of HE:
- Partial HE: can perform one type of operation on encrypted data, such as only additions or only multiplications, but not both.
- Somewhat HE: can perform more than one type of operation (e.g., addition, multiplication) but enables a limited number of operations.
- Fully HE: can perform more than one type of operation and there is no restriction on the number of operations performed.
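A classic illustration of partial HE is unpadded ("textbook") RSA, which is multiplicatively homomorphic: multiplying two ciphertexts yields the encryption of the product of the plaintexts. The tiny primes below are insecure and chosen purely for illustration; production HE uses dedicated schemes and libraries.

```python
# Toy partial HE: textbook RSA is multiplicatively homomorphic.
# Insecure toy parameters -- for illustration only.
p, q = 61, 53
n = p * q                            # modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (modular inverse)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

c1, c2 = encrypt(7), encrypt(6)
product_cipher = (c1 * c2) % n       # multiply ciphertexts without decrypting...
result = decrypt(product_cipher)     # ...the decryption equals 7 * 6
```

Here the party doing the multiplication never sees 7 or 6, yet the key holder recovers 42; a partial HE scheme like this supports only that one operation, whereas fully HE supports arbitrary mixes of additions and multiplications.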
Zero-knowledge proofs
ZKPs use cryptographic algorithms that allow one party to prove to another that a statement is true without revealing the underlying data that proves it.
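A minimal sketch of the idea is one round of the Schnorr identification protocol: the prover demonstrates knowledge of a secret exponent x with y = g^x mod p without ever sending x. The parameters below are toy values for illustration only, not secure choices.

```python
import random

def schnorr_round(p, g, x, rng):
    """One round of the Schnorr protocol (toy parameters, illustration only):
    prove knowledge of x satisfying y = g^x mod p without revealing x."""
    y = pow(g, x, p)                  # public key; x stays secret
    # Commitment: prover picks random r and sends t = g^r
    r = rng.randrange(1, p - 1)
    t = pow(g, r, p)
    # Challenge: verifier picks random c
    c = rng.randrange(1, p - 1)
    # Response: prover sends s = r + c*x (mod p-1); x itself never leaves the prover
    s = (r + c * x) % (p - 1)
    # Verification: g^s must equal t * y^c (mod p)
    return pow(g, s, p) == (t * pow(y, c, p)) % p

ok = schnorr_round(p=467, g=2, x=153, rng=random.Random(1))
```

The verifier learns only that the prover knows some x behind y; the transcript (t, c, s) can be simulated without x, which is what makes the proof zero-knowledge.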
Trusted execution environment
A TEE is a segregated area of memory and CPU that is protected from the rest of the system using encryption. Data in the TEE cannot be read or tampered with by any code outside that environment. A TEE assumes the operating system is untrustworthy and does not allow it to access data stored in the secure area. TEEs can be used when sensitive data needs to be stored safely, or when insights must be generated from data without revealing the dataset to the party running the analysis or hosting the TEE.
Secure multiparty computation
SMPC is a cryptographic technique related to HE, with one key difference: it enables values to be computed jointly from multiple parties' private data sources. Because SMPC scales to larger volumes of data, ML models can be applied to the protected data. It relies on a technique called secret sharing, in which each participating party's data is split into fragments and distributed among the other parties.
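Additive secret sharing can be sketched in a few lines: each party splits its private value into random shares that sum to the value, so any subset of shares reveals nothing, yet the parties can jointly compute a sum. The salaries and party count below are illustrative assumptions.

```python
import random

def share(secret, n_parties, modulus, rng):
    """Split a secret into n additive shares that sum to the secret mod modulus.
    Any n-1 shares are uniformly random and reveal nothing about the secret."""
    shares = [rng.randrange(modulus) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares

MOD = 2**31 - 1
rng = random.Random(7)
salaries = [52_000, 61_000, 48_000]                  # each party's private input
all_shares = [share(s, 3, MOD, rng) for s in salaries]
# Each party locally adds up the one share it received from every input...
partial_sums = [sum(col) % MOD for col in zip(*all_shares)]
# ...and only the combined total is ever reconstructed.
total = sum(partial_sums) % MOD
```

No party ever sees another party's salary, yet together they learn the exact total; production SMPC frameworks extend this idea to multiplication and full ML workloads.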
Federated learning
Federated learning is an ML technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging the data itself. Because the servers are decentralized, users can also achieve data minimization by reducing the amount of data that must be retained on a centralized server or in cloud storage.
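The core loop can be sketched as federated averaging (FedAvg): each client trains on its own data and only the model weights travel to the server, which averages them. The one-parameter linear model and per-device datasets below are hypothetical, chosen to keep the sketch self-contained.

```python
# Minimal FedAvg sketch: raw data never leaves a client; only weights do.
def local_train(w, data, lr=0.01, steps=50):
    """A few gradient-descent steps fitting y = w * x on one client's data."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

clients = [                        # hypothetical per-device datasets (y is roughly 3x)
    [(1.0, 3.1), (2.0, 5.9)],
    [(1.5, 4.4), (3.0, 9.2)],
    [(0.5, 1.6), (2.5, 7.4)],
]

w_global = 0.0
for _ in range(10):                # communication rounds
    local_ws = [local_train(w_global, d) for d in clients]
    w_global = sum(local_ws) / len(local_ws)   # server averages the updates
```

After a few rounds the shared model converges toward the slope underlying all three datasets, even though the server never observed a single (x, y) pair.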
ISACA has published a white paper, Exploring Practical Considerations and Applications for Privacy Enhancing Technologies, that takes a deep dive into PETs. The white paper covers legal and regulatory trends related to privacy and data protection across geographies, and describes the PETs along with their benefits, limitations, challenges, example applications and several case studies.
Harnessing data without compromising privacy
PETs provide the tools to harness the power of consumer data without compromising individual privacy. They protect individual privacy, reduce the risk of data breaches and play an important role in enabling businesses to leverage the deluge of data available to them.
By integrating PETs into data collaboration, businesses can allow multiple parties to analyze and use sensitive data while keeping it confidential and protected from the risk of a data breach.
About the author: Chetan Anand, CDPSE, AGILE SCRUM MASTER, CCIO, CPEW, ICBIS, ICCP, ICOSA, CPISI, ONETRUST FELLOW OF PRIVACY TECHNOLOGY, IRAM2, ISO 27001 LA, ISO 22301 LA, ISO 27701, ISO 31000, ISO 9001 LA, LEAN SIX SIGMA GREEN BELT, NLSIU PRIVACY AND DATA PROTECTION LAWS, SQAM, is the associate vice president of information security and chief information security officer (CISO) at Profinch Solutions, where he oversees all strategic and operational aspects of information security. He has over 20 years of professional experience in information and cybersecurity, business continuity, privacy, risk and quality. He has worked in various industries, such as IT, IT-enabled services (ITES), fintech, healthcare, pharmaceuticals, manufacturing, and research and development, in capacities spanning technical, managerial and leadership roles. He volunteers with ISACA and was one of the reviewers of the CDPSE Review Manual. He currently serves as a topic leader for the CDPSE Exam Prep community and as a CDPSE trainer with the Bangalore Chapter. He also volunteers with the Bureau of Indian Standards, participating in International Organization for Standardization (ISO) standards formulation and technical committee work, and with the Information Sharing and Analysis Center as a trainer and contributor to standards formulation.