Is Web 3.0 our only way out of a world of bad privacy incentives?
And can we salvage Web 2.0?
This is part 3 of 4 of a series of musings on the topic of online privacy. I don’t pretend to resolve the problem; I am simply exploring facets of the space and pulling at strings that may make the web a more wholesome place to explore, and that may help builders think about the moral valence of their technical decisions. View part 2, part 4.
TL;DR: Web 3.0 will unbundle online services by making data storage independent from app logic and ensuring users maintain control over their data. However, such systems are not a silver bullet, and the likely outcome is not that Web 3.0 will replace Web 2.0 but rather that the two will coexist and compete. This competition means more tooling with which to build privacy-preserving products across the Web (2.0 and 3.0).
The Internet’s impact on human development is beyond question, yet there’s a good case to be made that the web is broken in fundamental ways. The statement should not be so surprising. After all, the Internet was initially designed as a communications network that might withstand nuclear war, not as a platform that should enable more than 10% of the U.S. GDP, on which our most personal information transits, and which powers critical infrastructure from power plants to toaster ovens.
A number of measures have helped “upgrade” Internet infrastructure to enable this transformation. These range from NIST’s continuous adoption of new cryptographic standards and the development of new data routing algorithms to the emergence of shared data centers, CDNs and many other “cloud” services.
Yet, there is the deeper issue of incentives online. There was no central planning for the development of the Internet economies and societies, and some emergent properties of the system have been detrimental to its end-users, their data, and their privacy.
Accordingly, many in the tech community now advocate for the adoption of protocols that put the end-user back in control of their own data. For them, it is time to move on from Web 2.0, the read-write-interactive web you know today, and to which you upload your own data.
We can understand the development of Web 2.0 advertisement-based business models as stemming in part from the Internet’s architecture itself. In the server-client model, trust is established on the basis of where data comes from. It is assumed that the data you request will come from a single authoritative source. Guarantees are made that the data has not been tampered with from source to destination (server to client), but no guarantees are made about the source’s use of data. This implicit trust in the data source, and the assumption that the source owns the data in question, promotes data monopolies.
One answer to this data-ownership model is the Distributed/Decentralized Web, or DWeb. It operates on a peer-to-peer architecture.
In a peer-to-peer system, the data should be able to come from any peer (i.e. any computer in the network), and so the onus of trust is put on the data itself. This is the idea behind content-addressing in IPFS, for instance, where you request a specific piece of data rather than whatever is stored at a specific address. Given that the data can come from anywhere, its “trustworthiness” cannot depend on the identity of its source.
Taking it further, in ensuring the data can be served from anywhere, p2p decentralizes data stores. The data is not stored with any single entity. Accordingly, it becomes portable across services, thereby allowing users (or more likely developers building on their behalf) to more freely compose the services they use. For instance, in such a system you might be able to pick your social network’s friend recommendation engine, messaging system, and terms of service independently from one another, since there is no centralized data lock-in to force you to accept a single system.
In this way, the hope of many Web 3.0 proponents is that by forgoing the predominant server-client model of distributed applications and decoupling app logic from data storage, power over networks will lie less in owning infrastructure or the data that sits on it (the where) and more in producing or curating quality content using that data.
At this point, I want to distinguish between Web 3.0 and DWeb. They are often used interchangeably, as this new architecture that should power the next generation of the Internet. Nonetheless, I find it helpful to distinguish them as follows:
- DWeb or Distributed Web should mean the set of underlying technical changes that enable a move away from the client-server model toward peered infrastructure. While DWeb implies a deeper change in data ownership, it’s really about infrastructure. It powers all manner of decentralized systems, including Web 3 products (see below), Decentralized Finance applications, and much more.
- Web 3.0 should mean the larger project of building a fully-open, trustless Web. One in which centralized authority by corporations is replaced by transparent algorithmic rulesets everyone has access to (or DAOs). Web 3.0 is a wider technological project to remove centralizing market forces from the web and make data fully portable.
Note that I’ve often heard a separate distinction which goes more like this: DWeb is a decentralized Internet (communication infrastructure), and Web 3.0 the decentralized World Wide Web (a set of applications running on it). In this definition, Web 3 seems to be a subset of the DWeb, and both involve cryptocurrencies often. Moving on…
Now, there is a question of incentives for Web 3.0. The hope is that 3.0 incentives align the user interest with that of the service-provider by removing the centralizing forces that help create data monopolies. But then why would anyone run infrastructure to benefit another party since it no longer means privileged access to that party’s data, as it often did in Web 2.0? How does the network operate without a centralized overseer? Volunteer networks like Tor work to some extent, but can web-scale systems fully depend on volunteers?
The Web 3.0 answer is that they don’t need to. In order to operate within the distributed system, all of its participants (or nodes) have to agree on its state. This agreement is achieved thanks to certain participants who run a consensus algorithm which anyone can verify (in that sense the system is “trustless”: there’s no need to take anyone’s word for its state). In many distributed systems, like Bitcoin, Ethereum or Filecoin, those participants earn a system-native currency in exchange: a cryptocurrency.
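As a rough illustration of what “anyone can verify” means, here is a toy hash-linked ledger with a proof-of-work style check. It is a sketch under heavy assumptions: the block format, the `DIFFICULTY_PREFIX` constant and the function names are invented for this example and do not match Bitcoin, Ethereum or Filecoin, and real networks add rewards, networking and fork-choice rules that are left out here.

```python
import hashlib
import json

# Assumption: a block is "valid work" if its hash starts with this prefix.
DIFFICULTY_PREFIX = "000"
GENESIS_PREV_HASH = "0" * 64


def block_hash(block: dict) -> str:
    """Hash a block deterministically by serializing it with sorted keys."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def mine_block(prev_hash: str, payload: str) -> dict:
    """Search for a nonce that makes the block hash meet the difficulty rule."""
    nonce = 0
    while True:
        block = {"prev_hash": prev_hash, "payload": payload, "nonce": nonce}
        if block_hash(block).startswith(DIFFICULTY_PREFIX):
            return block
        nonce += 1


def verify_chain(chain: list[dict]) -> bool:
    """Re-check every link and every proof of work without trusting the miner."""
    expected_prev = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != expected_prev:
            return False
        digest = block_hash(block)
        if not digest.startswith(DIFFICULTY_PREFIX):
            return False
        expected_prev = digest
    return True
```

Producing a block takes many hash attempts, while checking a whole chain takes one hash per block. That asymmetry is what lets any participant audit the state cheaply instead of taking the producer’s word for it.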
These sorts of distributed systems allow for alternative business models to the Web 2.0 ones that pitted creator interests against those of their users. In that sense, the pitch for Web 3.0 is that crypto-native tokens will solve the incentive problem of Web 2.0. But what does that actually mean?
Let’s start with a couple facts:
- One: Tokenization means running incentivized protocols is now revenue-generating.
Running traditional Internet infrastructure does not generate revenue in itself. Large companies who power the Internet must find ways to monetize. They can do so by charging customers directly (e.g. as DigitalOcean, Cloudflare, Akamai or Verizon do), or indirectly by finding alternative ways of deriving revenue from their operations. This typically means monetizing data ownership (as Google or Facebook do) and building moats around this data to prevent others from mining the same (data) vein.
In Web 3.0, protocols are directly monetizable thanks to tokens. You earn Bitcoin for tallying account balances on that financial network, Ether for running computation on the Ethereum world-computer, Filecoin for providing storage, etc.
- Two: Decentralization means substituting cryptographic proof for legal recourse.
Trust between Internet service-providers and consumers is ultimately based on contracts and regulation. Certainly, there’s goodwill and brand reputation: you know the companies who serve you and expect they’ll do right by you because it would be costly for them not to (word would surely get out). But when it comes down to it, you expect AWS stores your data because you believe you can sue them if you discover they do not. You expect Google is not selling your family photos because you assume some law prevents them from doing so. Trust is mediated: enforcement in case of breach of contract must come from a lawyer or the government.
Because of decentralization, Web 3.0 is forced to rely on cryptographic proofs to create trust. You don’t really know who your service provider is or what law they operate under. In the cryptographic setting, no need to assume anymore. You can verify.
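One common building block behind “you can verify” is a Merkle inclusion proof: you keep only a short root hash, and a provider you know nothing about can prove that a given chunk belongs to the data you committed to. The sketch below is a toy version with hypothetical function names. Production storage networks such as Filecoin rely on considerably more elaborate proofs, and this naive tree, which duplicates the last node on odd levels, is not hardened against known attacks.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Build every level of the tree, from leaf hashes up to the single root."""
    levels = [[h(chunk) for chunk in chunks]]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2 == 1:
            current.append(current[-1])  # duplicate the last node to form pairs
        levels.append(
            [h(current[i] + current[i + 1]) for i in range(0, len(current), 2)]
        )
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hash at each level, noting whether it sits on the left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(root: bytes, chunk: bytes, proof: list[tuple[bytes, bool]]) -> bool:
    """Recompute the path to the root using only the chunk and the sibling hashes."""
    node = h(chunk)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The verifier needs the root, the chunk and a handful of hashes, never the provider’s identity or jurisdiction. The trust that a contract would otherwise carry is replaced by a check anyone can run.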
Accordingly, the argument that Web 3.0 aligns creator and user incentives goes something like this:
- “The Revenue Argument”: Enabling infrastructure providers to earn revenue from running protocols removes the need to monetize data. Creators will earn money immediately if they create a service of value to consumers. E.g., if Facebook earned revenue from each HTTP request it served, it might not need to sell your data.
- “The Competition Argument”: Decoupling revenue from data ownership fosters more innovation. Consumers bringing their own data (BYOD) to the services they use removes moats in businesses with network effects, like social networks or marketplaces. This means more competition and so better product experiences.
- “The Transparency/Verifiability Argument”: Basing trust on cryptography means direct accountability for rule breakers. In Web 3.0, not abiding by the rules of a cryptographic system leads to immediate loss of revenue or punitive action. Removing the uncertainty of “will my wrongdoing be discovered” and “would I really lose this lawsuit” changes the risk calculus for service providers.
Sounds peachy.
Obviously, in practice things are far more complex.
To start, cryptographic protocols are extremely complex. In a sense, we are substituting trust in the legislator/lawyer for trust in the cryptographer. I don’t know about you, but I won’t be auditing the SNARK circuits of zero-knowledge proofs running in Web 3.0 protocols on my own.
Beyond that, the incentive-alignment outlined above gives rise to new centers of power and new forms of potential misalignment. There is tension between the interests of developers, miners and users. Simply look at how Bitcoin protocol upgrades have gone.
Further, even if incentives are more straightforward for some of the protocols that make up the backbone of Web 3.0 (we call these layer 1 protocols), they become much more complex as you go up layers to provide services on top of the Web 3.0 backbone. There is no shortage of sketchy stuff happening in Web 3.0 and the transparency of clear incentives disappears as you compose protocols.
Finally, there’s a question of the cost of running these decentralized protocols. We’re talking about a multiparty computation run over an arbitrarily large network… There are obviously high costs to running such systems, or coordinating decentralized parties. These can be felt in the crappy product experiences of Web 3.0 today.
It’s fair to expect that many of these costs will go away as decentralized infrastructure matures (much as centralized infra has over the last few decades), and new Web 3.0 UX is invented. But does that mean every product on the web today should eventually move to Web 3.0?
As with everything, it’s a question of trade-offs. Tomorrow’s creators should still be asking themselves “Should my product be decentralized?”
Specifically,
- Should it run on distributed infrastructure (on the DWeb)? The costs involved here are computational; we can assume bandwidth costs will be more than made up for compared to the server-client model.
- Should it run without centralized authority (as a Web 3.0 product)? The costs involved here are computational and bandwidth as well as overall system complexity and dependency on third parties to run your system.
To answer these questions, ask yourself:
- Are there clear network effects involved here?
- Can decentralized incentives or governance prove useful in solving key coordination problems in the system?
- Are there local factors to counter economies of scale (which play well to centralization)?
- Is trustlessness or verifiable transparency a clear product requirement?
For the majority of online products today, the answer is no: decentralization is probably not worth it. Most digital products today work far better as centralized services. Far from making me a crypto-bear, this is specifically what excites me about the space: I expect that Web 3.0 will do to the web what mobile did to personal computing. Web 3.0 will not fully replace or destroy the Web as we know it; it will grow our digital pie by enabling new product experiences we couldn’t have had before. Just as smartphones/mobile did not destroy desktop/laptop/server usage, the decentralized web will not replace the centralized web wholesale. It will increase the number and types of digital interactions humans can have. We’re seeing this today in parts of DeFi (decentralized finance), where social/community dynamics are driving change in a way they would not in a centralized system.
Besides, there are some things to fear from Web 3.0: complete transparency as a default in an open-system is not exactly conducive to privacy-by-default. Likewise, the censorship-resistance of decentralized systems (there is no central curator) may lead to more unchecked proliferation of harmful content across the web. Tokenization may lead to the outright financialization of human behaviors through algorithmic rule-making. The list goes on.
I expect we are headed to a future in which Web 3.0 and 2.0 compete and coexist. This is a future in which product architecture and data governance decisions are shaped by a more honest accounting of the technical and ethical trade-offs involved. Let me make a quick case for this future.
This statement may be anathema in some circles, but trust is a necessary part of any functioning society. Not blind trust in institutions or one another, but the form of trust-and-verify that has led to the development of modern society. Besides which, any provable trust in a digital system must always end up in some societal trust.
So let’s ask ourselves: should a corporation like Google know everything you do online at any moment? Certainly not. But is the only rational alternative one in which you are anonymous or pseudonymous at all times? Perhaps not. Understanding what you are giving up and having optionality as to what product/privacy tradeoff you make is core to privacy-by-default, but I worry about full-anonymity as a default. To what extent will it help dehumanize interactions, remove accountability from personal action and so fuel polarization and vitriol? While the Web’s enablement of mass surveillance is an existential problem, I shudder to think about a society in which everyone always wears a mask or a hood online. It’s important not to conflate data dignity and online anonymity.
While this would mean users may continue to have to trust some of their centralized service providers, there’s hope yet! The many privacy-preserving primitives left underdeveloped by Web 2.0 companies (who had little incentive to invest in them) are maturing in Web 3.0. The cryptocurrency space has proven to be an amazing testing ground for networking and cryptographic systems (quoting Dan Boneh here). A lot of the recent productization of tooling like zero-knowledge proofs, new applications for certain signature schemes, or forms of multiparty computation has taken place on the blockchain. This is a huge tail-wind for companies who want to build privacy-preserving products, centralized or not. Web 3.0 will prevent product-makers from using flimsy technical excuses when justifying why they’re screwing their users over and lead to more choice online.
Back to our original question: Is Web 3.0 our only way out of a world of bad privacy incentives? It’s complicated, but in short, no. The advent of Web 3.0 will not fully solve all fundamental issues related to data stewardship and trust in the services we use.
While it offers some solutions to bad incentives that exist in Web 2.0, decentralization is not the right answer to every privacy problem. Yet there’s much to be hopeful about with regard to the Internet’s evolution and potential paths out of the data lock-in we face at the application and infrastructure levels.
We can hope that in competing with centralized alternatives, decentralization and the transparency it entails will rid us of worst-offender data monopolies and help align creator and user incentives.
All of this might enable end-users to make more deliberate choices online. But it’s important to remember that decentralization is not in itself a guarantee of respect for user data. Likewise, it’s important not to take any of this development for granted. We need to build better tooling to make privacy-by-default easier, and enable creators to build accountable systems and more easily communicate their data decisions with their users.
The choice for creators today still looks too often like a choice between caring about privacy or building a shitty product. Thankfully, we’re seeing more and more products trying to address this issue.