Knowledge Burst Blog Series
How You Handle Data Privacy Will Determine If You Win Or Lose In A Web3 World
By Frank Ricotta, Chief Executive Officer and Tyson Henry, Chief Technology Officer
PETs/PECs/PEDs and Why They Matter
As you scroll through TikTok in the doctor’s waiting room, you may not give much thought to your data. But there are people and companies and governments who spend a lot of time thinking about your data. And if that doesn’t worry you, maybe it should.
Our whole world turned to online solutions when the pandemic hit. Everything from school and work to groceries and happy hours went digital, putting more of our data into the hands of companies that didn’t put a lot of thought into their privacy controls and cybersecurity.
On average, each person created at least 1.7 MB of data per second in 2020. Yep, per second. The big tech companies that control all that data are using it to build immersive metaverse experiences so they can, well, get more data. And with recent reports of mass surveillance by governments and big tech, you can see why regulators are scrambling to adopt regulations to support individual privacy.
In fact, the Federal Trade Commission just initiated a rulemaking process related to consumer privacy and Congress released a discussion draft of a comprehensive data privacy bill, the American Data Privacy and Protection Act (“ADPPA”).
Let’s face it: the current generation of technology requires that participants play by the rules. But we know they don’t. Hackers definitely don’t play by the rules and one can easily make the case that big tech and governments don’t either.
Consumers are tired of this infringement on our basic human rights. A KPMG survey revealed “86% of the respondents feel a growing concern about data privacy, while 78% expressed fears about the amount of data being collected. Some 40% of the consumers surveyed don’t trust companies to use their data ethically, and 13% don’t even trust their own employers.”
What are people doing about it? They’re driving an increasing demand for privacy-based products and services. So, here’s the 21st century challenge for companies: How can a business develop products that leverage people’s data to increase engagement and help people connect and communicate with each other, but do so in a way that fully respects people’s privacy and data rights?
Privacy Enhancing Technology
If you haven’t started thinking about that question, about your company’s privacy strategy and how the next wave of privacy regulations will impact your business, you’re already behind the curve. Privacy Enhancing Technology (PET), also known as Privacy Enhanced Computation (PEC), was identified as one of the top three key technology trends by Gartner in 2021 and again in 2022. In other words, if PET/PEC isn’t on your radar, it should be.
Gartner defines the PET/PEC category as follows:
“Privacy-enhancing computation secures the process of personal data in untrusted environments – which is increasingly critical due to evolving privacy and data protection laws as well as growing consumer concerns. Privacy enhancing computation utilizes a variety of privacy protection techniques to allow value to be extracted from data while still meeting compliance requirements.”
It’s interesting to note that Gartner’s definition of PET/PEC has changed since 2021, from being mostly focused on organizational and cross-organizational use of data to a much broader definition centered on consumer privacy. Given today’s accelerating pace of digital transformation, the expansion of this category is well warranted.
As veterans in the privacy and cybersecurity space, we believe the current generation of PET/PEC is fast approaching its event horizon. While there are some technical advances that may slightly extend the lifespan of legacy PET/PEC systems, there exists a fundamental flaw in the underlying premise: legacy PET/PEC systems assume that data is dumb. As such, they take a very traditional systems approach to data protection: controlling access to the database, where and how the data may be processed, and standard encryption techniques to hide sensitive information such as identity.
But data isn’t dumb anymore. Or at least, it doesn’t need to be. Next-generation PET and PEC is moving towards a more data-centric approach that uses what we’ll call Privacy Enhanced Data (PED). In the BurstIQ ecosystem, we call this smart data.
Types of PET/PEC
Overall, the goal of PET/PEC is to prevent the inadvertent disclosure of sensitive data such as Personally Identifiable Information (PII) and Protected Health Information (PHI).
PETs and PECs have three fundamental characteristics:
- They rely on a trusted environment to perform analysis and processing of sensitive data
- These environments can be extended to operate in a decentralized fashion
- They use encryption and algorithms to limit the exposure of sensitive data prior to analysis and processing
What exactly would be considered a PET or PEC? Well, there really isn’t a definitive list, but there are some existing and emergent technologies that are commonly associated with the PET/PEC space.
Homomorphic Encryption (HE)
Homomorphic encryption is one of the most commonly cited types of PET/PEC. Simply stated, homomorphic encryption means that data remains confidential or encrypted but can still be processed and used without compromising the underlying data or seeing the underlying values. Until recently, the process was simply too slow for any real-time transactional systems or analytics. However, computational technologies are now reaching a point where computational speeds are making this technique feasible. As compute speeds continue to improve, homomorphic encryption will likely become core to PET/PEC moving forward.
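To make the idea concrete, here is a minimal sketch of an additively homomorphic scheme (a toy version of the Paillier cryptosystem) in plain Python. The primes, function names, and parameters are illustrative assumptions, and the key size is far too small for real use. It is meant only to show that a party can add two numbers it cannot read.

```python
import math
import secrets

# Toy Paillier cryptosystem (additively homomorphic).
# ASSUMPTION: these small primes are for illustration only; real deployments
# use primes of 1024 bits or more and a vetted library.
P, Q = 104_729, 1_299_709

def keygen(p=P, q=Q):
    """Return (public modulus n, private key) for the toy scheme."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (valid because we use g = n + 1)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public modulus n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, priv, c):
    """Recover the plaintext using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

def scale_encrypted(n, c, k):
    """Raising a ciphertext to k multiplies the plaintext by k."""
    return pow(c, k, n * n)

# The party running add_encrypted never sees 20 or 22.
n, priv = keygen()
c_total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
total = decrypt(n, priv, c_total)  # 42
```

Note that this scheme supports only addition and multiplication by a known constant. Fully homomorphic schemes, which allow arbitrary computation, are considerably heavier, which is where the performance concerns mentioned above come from.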
Secure Enclaves
Secure Enclaves, also known as trusted execution environments (TEE), have a very energetic following: They protect data by trusting (or not trusting) a device, establishing access locations, and enforcing more robust authentication protocols. In other words, trusted execution environments use perimeter protection mechanisms. While these methods are popular from a traditional data processing perspective, they aren’t well suited for today’s distributed environments. While there will still be a place for this tech moving forward, it will play a more limited, complementary role to the core data-centric techniques.
Data Processing Techniques
There are a few different processing methods that manage data privacy by enabling data analysis and processing without the need for the full underlying data set. These include:
- Differential Privacy uses techniques to add statistical ‘noise’ before a computation, so the process is able to generate general data sets and statistically relevant results but hides the granular data set.
- Secure Multi-Party Computation (SMPC) is a family of cryptographic protocols that allows multiple parties to jointly compute a result over their combined data without any one party seeing the others’ inputs or the entire data set.
- Anonymized Data Sets tokenize specific fields that could be used as identifiers into a single composite encrypted token that prevents any reverse engineering. Note: given the advancing analytics and AI methods, it is becoming harder and harder to anonymize data and maintain statistical significance.
- Synthetic data sets are data sets that are generated from the real data set to be within the real-world limits, boundaries, and relationships of the real data.
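As an illustration of the first technique in the list, differential privacy, the sketch below applies the Laplace mechanism to a simple counting query. This variant adds the noise to the query result rather than to the raw records, which is one common way to apply the idea. The function names, the sample data, and the choice of epsilon are assumptions for illustration, and Python's `random` module is not suitable for production-grade privacy guarantees.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of records matching `predicate`.

    A counting query changes by at most 1 when one person is added or
    removed (sensitivity 1), so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: patient ages. The analyst sees only the noisy answer.
ages = [34, 51, 47, 29, 62, 58, 41, 39, 70, 45]
noisy_over_40 = private_count(ages, lambda a: a > 40, epsilon=0.5)
```

Each individual answer is blurred, but averages over many records stay statistically useful, which is the trade-off the bullet above describes.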
As PET/PEC evolves, these and other data processing techniques will be used, but probably not in the way they are currently deployed. Instead of acting as a data preparation process, they need to be applied in real-time. In other words, privacy-enhancing data processing needs to be a seamless part of the overall transactional flow.
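Secure multi-party computation, from the list above, can be sketched with additive secret sharing, one of its simplest building blocks. In this hypothetical example, three organizations compute a combined total without any of them revealing its own number. The modulus, the sample values, and the function names are assumptions for illustration, and real protocols add authentication and protection against dishonest parties.

```python
import secrets

# ASSUMPTION: any modulus comfortably larger than the expected total works.
MODULUS = 2**61 - 1

def share(secret, n_parties, modulus=MODULUS):
    """Split `secret` into n random-looking shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares

def reconstruct(shares, modulus=MODULUS):
    """Combine shares to recover the shared value."""
    return sum(shares) % modulus

# Three parties, each holding a private count.
inputs = [120, 75, 310]
n = len(inputs)

# Each owner splits its input and sends one share to every party.
distributed = [share(x, n) for x in inputs]

# Party j adds up the j-th share of every input. Taken alone, each share
# is a uniformly random number and reveals nothing about any input.
partial_sums = [
    sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
]

# Only the combined partial sums reveal the result: the total, not the inputs.
total = reconstruct(partial_sums)  # 505
```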
There is also a growing list of PET/PEC focusing on AI. These include things such as:
Federated Learning
At first blush, federated learning may not seem relevant to PET/PEC. However, federated learning relies on the principle that data itself never leaves its home. Instead, the learning algorithms are structured in a way that learning happens in a distributed fashion. Thus, no one node sees the entire data set. In this way, federated learning is similar to multi-party computation and achieves a level of security through obfuscation.
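A minimal sketch of this idea is federated averaging on a one-parameter model. Each client trains on its own data and sends back only its updated weight, and the server averages the weights. The data, learning rate, and function names below are illustrative assumptions. Note that model updates can still leak information about the underlying data, which is why federated learning is often paired with the other techniques described here.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps for the model y = w * x on local data.

    Only the resulting weight leaves the client; the (x, y) pairs do not.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(w_global, client_datasets, rounds=20):
    """Server loop: broadcast the weight, collect local updates, average them."""
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        local_ws = [local_update(w_global, d) for d in client_datasets]
        # Weight each client's result by how many samples it trained on.
        w_global = sum(
            w * len(d) for w, d in zip(local_ws, client_datasets)
        ) / total
    return w_global

# Hypothetical clients whose private data all follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]
w = federated_average(0.0, clients)  # converges toward 3.0
```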
Generative Adversarial Networks (GANs)
Now we’re getting into some advanced technology. GANs are a form of AI in which one network (the generator) focuses on learning to produce outputs while another (the discriminator) focuses on judging the behavior of the first. While not necessarily focused on data privacy per se, GANs will support how data is used (or more importantly, not misused) in intelligent systems.
Key Challenges for PET/PEC
As the world moves into Web3, Federated Learning and GANs become even more relevant. For most companies, it’s no longer enough to use PET to securely store their data and protect it from bad actors. That’s table stakes. They need to securely manage their own data, securely access data from partners and collaborators, securely run sophisticated intelligence operations on that data, and do all that at scale. A tall order for any company.
While the technologies described above are promising, they are fragmented and costly to implement if an organization desires to enforce privacy throughout the full data lifecycle.
In addition, the majority of PET/PEC techniques require replicating data sets. This presents several issues. First, data replication quickly leads to inconsistencies in the data and exponential growth in the number of copies of the data set. Even with the most advanced current PET/PEC techniques, it is nearly impossible to fully identify the original source of truth. Second, if we assume that data is dumb, which current PET/PEC does, protection of data once it leaves the original controlled environment is impossible because the control methods don’t follow the data.
So while current PET/PEC techniques are helpful in securing the data your company holds, they do nothing to secure the data you share. And in many cases, they make it much harder to run the critical intelligence operations that give your products value.
Introducing Privacy Enhanced Data (PED)… Smart Data
In a traditional PET/PEC environment, attributes like metadata, edge relationships, ownership, and use permissions are maintained separately from the data itself. Anyone who has tried to create and manage comprehensive data dictionaries and metadata dictionaries understands how difficult it can be to maintain the currency and accuracy of these directories.
Privacy Enhanced Data, or smart data, represents the next evolution in data privacy and intelligence. It is fundamentally a new data construct that fuses data attributes such as metadata, edge relationships, ownership, and use permissions into a new data object that is cryptographically signed and attested. In early publications, this was often referred to as self-aware data objects.
By fusing these attributes with the data itself, the smart data object is just that – smart. Certain embedded attributes, such as metadata and edge relationships, give smart data context. This context can be used in real-time to configure and drive the behavior of the processing systems. Instead of hard coding all the logic in the application or control systems, smart data is like having a real-time logic plug-in that drives the behavior of each data object independently.
The analytical and intelligence power of this cannot be overstated. If each smart data object can independently operate in processing systems based on its unique attributes, the system as a whole is able to learn, adapt, and optimize far more quickly than in traditional models. With the explosive growth in data volumes, the deep intelligence that companies are trying to glean from that data, and the broader shift into Web3, smart data thrives where traditional data models have faltered.
In addition to providing context, a smart data object contains trust attributes. First and foremost, the smart data object embeds and enforces ownership and use permissions, so data security remains intact even as the data is moved and replicated. In addition, trust attributes provide a detailed audit of how the data has been changed or updated over time, how ownership and use permissions have changed, and whether the data has been authenticated or verified by a trusted entity.
By embedding trust attributes within the data, smart data solves another tough problem: data integrity and privacy. Because ownership and use permissions are part of the data itself, the ability to revoke permissions becomes a standard feature.
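The shape of such an object can be sketched in a few lines of Python. This is a hypothetical illustration and not BurstIQ's implementation: the class name, fields, and methods are assumptions, and an HMAC with a shared key stands in for the digital signatures and attestation a real system would use. The point is that ownership, permissions, and the audit trail travel inside the signed object, so tampering or a stale permission is detectable wherever the object goes.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field

@dataclass
class SmartData:
    """Hypothetical data object that carries its own context and trust attributes."""
    payload: dict
    metadata: dict
    owner: str
    permissions: dict = field(default_factory=dict)  # grantee -> allowed actions
    audit: list = field(default_factory=list)        # history of permission changes
    signature: str = ""

    def _digest(self, key: bytes) -> str:
        # Sign the data together with its attributes, not the data alone.
        body = json.dumps(
            [self.payload, self.metadata, self.owner, self.permissions, self.audit],
            sort_keys=True,
        ).encode()
        return hmac.new(key, body, hashlib.sha256).hexdigest()

    def seal(self, key: bytes) -> None:
        self.signature = self._digest(key)

    def verify(self, key: bytes) -> bool:
        return hmac.compare_digest(self.signature, self._digest(key))

    def grant(self, key: bytes, grantee: str, action: str) -> None:
        actions = self.permissions.setdefault(grantee, [])
        if action not in actions:
            actions.append(action)
        self.audit.append(["grant", grantee, action])
        self.seal(key)

    def revoke(self, key: bytes, grantee: str) -> None:
        self.permissions.pop(grantee, None)
        self.audit.append(["revoke", grantee])
        self.seal(key)

    def can(self, key: bytes, who: str, action: str) -> bool:
        # Refuse everything if the object fails its integrity check.
        if not self.verify(key):
            return False
        return who == self.owner or action in self.permissions.get(who, [])

key = b"demo-key-not-for-production"
record = SmartData({"heart_rate": 62}, {"unit": "bpm"}, owner="alice")
record.seal(key)
record.grant(key, "clinic", "read")
clinic_can_read = record.can(key, "clinic", "read")
record.revoke(key, "clinic")
clinic_can_read_after = record.can(key, "clinic", "read")
```

In this sketch, enforcement still depends on the processing system calling `can` before it acts. The object supplies the rules and the proof that they have not been altered.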
Why does all this matter? Smart data disconnects data from its central control systems, so data security and intelligence are as mobile as the data itself. This frees the data to be shared, replicated, and updated – all without compromising the data security, integrity, and intelligence that are required to run your Web3 business.
The Role of Blockchain
“You can’t store data on a blockchain.” It’s a statement we hear often. But having worked with highly sensitive data and cryptographic security methods for over 30 years, we can confidently say, “You’re wrong.”
Many of the “experts” who jumped on the blockchain bandwagon in 2017 and 2018 see blockchain as synonymous with distributed ledger technology (DLT). Those of us who have been working in this space for decades know that blockchain is actually composed of multiple technologies that were combined to form what we now think of as blockchain.
By expanding our definition of blockchain beyond a limited DLT focus, we can see that the same methods that are used to create DLTs and cryptographic data objects (aka, blocks) can be used to create privacy-enhanced data (PED).
The core technologies inherent in blockchain allow us to assign ownership to a piece of data, manage permissions to that data, and establish data integrity and provenance. Once these core security layers are embedded into the data itself, the formerly “dumb” data object becomes a smart data object (i.e., a PED) which can be stored on chain, shared on chain, and even deleted (yes, deleted) on chain.
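One of those core technologies, hash linking, is easy to sketch. In the hypothetical example below, each record stores the hash of the record before it, so any later change to the history is detectable. That is the basis for the integrity and provenance claims above. The field names and functions are assumptions for illustration, and the sketch leaves out consensus, signatures, and everything else a production chain needs.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, prev_hash, data):
    """Hash a block's contents together with the hash of its predecessor."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "data": data}, sort_keys=True
    ).encode()
    return hashlib.sha256(body).hexdigest()

def append_block(chain, data):
    """Add a new block that commits to the entire history before it."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "index": index,
        "prev": prev_hash,
        "data": data,
        "hash": block_hash(index, prev_hash, data),
    })

def is_valid(chain):
    """Recompute every hash and check every link."""
    prev_hash = GENESIS
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["data"]):
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical provenance trail for one data object.
chain = []
append_block(chain, {"event": "created", "owner": "alice"})
append_block(chain, {"event": "granted", "to": "clinic", "action": "read"})
append_block(chain, {"event": "revoked", "from": "clinic"})
chain_ok = is_valid(chain)  # True until any block is altered
```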
When coupled with PET and PEC techniques, PEDs provide a very robust layered privacy model. At BurstIQ, we bring together the fundamental methods of PET, PEC, and PED (in combination with advanced data modeling techniques) to create LifeGraph. LifeGraph uses PET, PEC, and PED constructs to create a highly secure and contextualized data picture of a person – their digital DNA. Smart data is embedded into a robust protocol stack that uses blockchain methods at multiple layers, enforcing data immutability, ownership, provenance, state, and, through specialized smart contracts, chain of custody, use, and access.
By combining PET, PEC, and PED technologies, it becomes possible to decentralize privacy and intelligence without one compromising the other. It becomes possible to maintain data trust, integrity, lineage, and security even as data moves around a decentralized ecosystem. And it becomes possible to run true distributed intelligence and gain deeper insights from it through the additional layers of context embedded in smart data.
In short, smart data makes it possible to deliver on the promise of Web3. If you’d like to learn more about Web3, check out our blog.
The Promise of Protected Data
This is both the fundamental opportunity of Web3 and its fundamental challenge – to create a future in which data is connected, individual privacy and ownership rights are respected, incentives are aligned, and trust is inherent.
Given that the whole purpose of PET/PEC is to protect data, protect privacy, and protect against misuse, coupling PET/PEC capabilities together with PED makes it possible to create a world in which the digital rights and dignity of every person are respected, no matter what. A world in which corporate profits are determined by the degree to which a company can offer people value in exchange for the privilege of accessing their data, rather than the degree to which they can control and hoard that data.
Regulations and consumer demand are forcing companies to adapt to a Web3 world. Companies that adopt a Web3 data strategy are benefitting from greater consumer trust, higher value products, and increased market share. Most importantly, people are being empowered with the means to own, control, and derive value from their data in the way that works for them. And that is the true promise of Web3.
If you’d like more info, check out what BurstIQ’s Web3-ready platform with advanced PET/PEC/PED capabilities can do for you. Better yet, let’s talk, and we’ll show you a demo of how the most advanced data platform on the planet can help make data privacy central to everything you do.
About BurstIQ
BurstIQ fuels trust-first digital strategies with human data. LifeGraphs® take the complexity out of managing sensitive human data, freeing organizations to build trust through hyper-personalized health, work, and life digital experiences. In an era of data abundance, LifeGraphs promote trust between organizations and the individuals providing data through blockchain-powered governance and consent. The LifeGraph ecosystem provides a single source of truth and an intelligent ecosystem, helping businesses gain a deep understanding of the people they serve. Armed with granular insights, they can deliver more value in digital experiences and make an increasingly digital world more human.