Both crypto and AI have seen remarkable progress in the past few years.
Crypto celebrated successes like DeFi, and more recently DeSci.
Back in 2018, Peter Thiel pointed to the tension between the decentralizing forces of crypto and the centralizing forces of AI, coining the phrase “Crypto is libertarian, AI is Communist.” Here I want to argue that we may learn something by combining the two.
Why? Because skills and approaches honed by the security and crypto community have the potential to unlock useful applications of AI and mitigate AI risks.
Eliezer Yudkowsky, an eminent figure in AI safety, recently made a surprising appearance on the Bankless Podcast, a distinctly Web3 podcast.
It was surprising for two reasons:
First, Eliezer thinks we are on a rapid path to developing Artificial General Intelligence (AGI) that can perform virtually all tasks that humans can, and that such AGI will very likely kill us all.
Second, when asked if there is anything one may do to increase the tiny chance we could survive, he encouraged security and cryptography-oriented people with a strong security mindset to try their hand at AI alignment.
Let’s unpack that. First, we’ll discuss why we should worry about AGI, before zooming into the promise that the crypto (here meaning primarily cryptography) and security communities hold for mitigating some of the dangers of AGI.
As anyone glimpsing the news recently can attest, no week goes by without progress in AI accelerating dramatically. In case you missed it, here are just three crucial developments:
First, there has been a push toward more centralization of AI, for instance by Microsoft investing in OpenAI, Google investing in OpenAI’s competitor Anthropic, and DeepMind and Google Brain merging into one organization.
Second, there has been a push for more generalized AI. The recent paper “Sparks of Artificial General Intelligence: Early experiments with GPT-4” showed how GPT-4 already demonstrates first instances of theory of mind, a measure usually used to assess human intelligence.
Third, there has been a push for more agency in AI systems, with AutoGPT becoming more agentic by re-prompting itself to achieve more complex tasks.
Back in December, Metaculus, a forecasting platform, predicted the arrival of AGI roughly in the year 2039. Now, in May, the date is at 2031 – in other words, an eight-year timeline drop within five months of AI progress.
If we take these developments as signs that we are on the path toward Artificial General Intelligence, the next question is: Why is AGI safety considered so hard?
Arguably, we can break the problem of AGI safety down into three subproblems:
Alignment: How can we align AI with human values?
AI alignment is the seemingly simple question of how we get AIs to align with our values. But it’s easy to forget that we don’t even agree on what our values are. Since the dawn of civilization, philosophers and mere mortals alike have argued about ethics, with convincing points on all sides. That’s why our current civilization arrived, mostly, at value pluralism (the idea of humans with conflicting values peacefully co-existing). That works for a diversity of human values but is difficult to implement in a single artificially intelligent agent.
Let’s imagine for a sweet minute that we knew, roughly, what moral values to equip the AGI with. Next, we need to communicate these human values to a silicon-based entity that doesn’t share human evolution, mind-architecture, or context. When humans coordinate with other humans, we can rely on plenty of shared implicit background knowledge since we share our species’ biology, evolutionary history, and often even some cultural context. With AI, we cannot rely on such a common context.
Another problem is that, for the pursuit of any goal, it’s generally instrumentally useful to be alive and to acquire more resources. This means that an AI set to pursue a specific goal could resist being shut down and seek more and more resources. Given the countless ways in which an AI could achieve goals that include human injury, neglect, deceit, and more, and given how hard it is to predict and specify all those constraints in advance in a reliable way, the job of technical alignment is daunting.
Even if humans agree on a set of values, and figure out how to technically align an AGI with them, we still can’t expect it to act reliably without proof that the underlying software and hardware are themselves reliable. Given the sizable advantage that AGI conveys to its creators, malicious hackers may sabotage or reprogram the AGI.
Further out, an unintentional bug could interfere with the AGI’s goal execution or the AGI could itself exploit vulnerabilities in its own code, for instance by reprogramming itself in dangerous ways.
Unfortunately, we have built today’s entire multi-trillion-dollar ecosystem on insecure cyber foundations. Most of our physical infrastructure, such as the electric grid and our nuclear weapons technology, is based on hackable systems. In the future, even insecure self-driving cars and autonomous drones could be hacked to turn into killer bots. Mounting cyberattacks such as Sputnick or SolarWinds are severe but may be benign when compared to potential future AGI-enabled attacks. Our lack of meaningful response to these attacks suggests that we are not up to the task of AGI-safe security, which may require rebuilding much of our insecure infrastructure.
Making progress on the alignment and security of AGI could take time, which makes it important for actors building AGI to coordinate along the way. Unfortunately, incentivizing major AI actors (whether corporations or nation states) to cooperate and avoid arms-race dynamics to get to AGI first is not that straightforward. It takes only one actor defecting from an agreement to cause catastrophe: even if everyone else cooperates, an actor that races ahead secures a decisive advantage. This first-mover advantage persists until AGI is built, and given the power that the unitary deployment of an AGI system may confer on its owner, it is a difficult temptation for the owner to forgo.
Perhaps you have nodded along so far: Yes, sure, AI safety is really hard. But what in the world does crypto have to do with it?
Given the rapid pace of AI progress, and the difficulties in making it safe, the traditional concern is that we are racing toward an AGI singleton scenario, in which an AGI displaces human civilization as the overall framework of relevance for intelligence and dominates the world, potentially killing humanity along the way.
By leveraging technologies and skills in the security and cryptography communities, we may be able to change course to instead pursue a multipolar superintelligence scenario, in which networks of humans and AIs securely cooperate to compose their local knowledge into the collective superintelligence of civilization.
This is a big, abstract claim, so let’s unpack how exactly the crypto and security communities could help tame AI risks and unleash AI’s beauty by unlocking new applications.
How can security and cryptography tame AI risks?
Paul Christiano, a reputable AI safety researcher, suggests that AI desperately needs more red-teaming, usually a term used in computer security to refer to simulated cyber attacks. Red-teams in the AI context could, for instance, be used to search for inputs that cause catastrophic behaviors in machine learning systems.
Red-teaming is also something the crypto community has experience with. Both Bitcoin and Ethereum are developing in an environment that is under continuous adversarial attack, because insecure projects pose the equivalent of multimillion-dollar cryptocurrency “bug bounties.”
Non-bulletproof systems are eliminated, leaving only more bulletproof systems within the ecosystem. Crypto projects undergo a level of adversarial testing that can be a good inspiration for systems capable of withstanding cyberattacks that would devastate conventional software.
A second problem in AI is that multiple emerging AIs may eventually collude to overthrow humanity. For instance, “AI Safety via Debate,” a popular alignment strategy, relies on two AIs debating topics with each other, with a human judge in the loop deciding who wins. However, one thing the human judge may not be able to exclude is that both AIs are colluding against her, with neither promoting the true result.
Again, crypto has experience with avoiding collusion problems, such as the Sybil attack, which uses a single node to operate many active fake identities to covertly gain the majority of influence in the network. To avoid this, a significant amount of work on mechanism design is emerging within crypto, and some may have useful lessons for AI collusion, too.
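To make the Sybil problem concrete, here is a minimal sketch in Python. It compares a naive one-identity-one-vote tally with a stake-weighted tally. Stake weighting is only one illustrative countermeasure and is an assumption of this sketch, as are all names and numbers in it.

```python
def identity_vote(votes):
    """Tally one vote per identity. votes: list of (identity, stake, choice)."""
    tally = {}
    for _identity, _stake, choice in votes:
        tally[choice] = tally.get(choice, 0) + 1
    return max(tally, key=tally.get)


def stake_vote(votes):
    """Tally votes weighted by the stake each identity has locked up."""
    tally = {}
    for _identity, stake, choice in votes:
        tally[choice] = tally.get(choice, 0) + stake
    return max(tally, key=tally.get)


# Hypothetical network: 10 honest nodes with 100 units of stake each.
honest = [(f"honest-{i}", 100, "A") for i in range(10)]
# One attacker splits 100 units of stake across 50 fake identities.
sybils = [(f"sybil-{i}", 2, "B") for i in range(50)]

votes = honest + sybils
print(identity_vote(votes))  # the 50 fake identities outnumber the 10 honest ones
print(stake_vote(votes))     # splitting stake across identities buys no extra weight
```

The point of the design is that influence is tied to a resource that is costly to duplicate, so creating more identities does not create more influence.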
Checks and balances
Another promising safety approach currently explored by OpenAI competitor Anthropic is “Constitutional AI,” in which one AI supervises another AI using rules and principles given by a human. This is inspired by the U.S. Constitution design, which sets up conflicting interests and limited means in a system of checks and balances.
Again, the security and cryptography communities are well experienced with constitution-like checks-and-balances arrangements. For instance, the security principle POLA (Principle of Least Authority) demands that an entity should have access only to the least amount of information and resources necessary to do its job. That is a useful principle to consider when building more advanced AI systems, too.
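As a rough illustration of least authority, the Python sketch below hands an agent a narrow, read-only view of a record instead of the record itself. The class and function names are hypothetical, and Python attributes are not a real security boundary, so this shows the shape of the idea rather than an enforcement mechanism.

```python
from types import MappingProxyType


class ScopedStore:
    """Read-only view that exposes only the keys a task actually needs."""

    def __init__(self, data, allowed_keys):
        # Copy just the permitted fields and freeze them behind a read-only proxy.
        self._view = MappingProxyType({k: data[k] for k in allowed_keys})

    def read(self, key):
        if key not in self._view:
            raise PermissionError(f"no authority to read {key!r}")
        return self._view[key]


def summarize_age(store):
    """Hypothetical agent task that needs only the 'age' field."""
    return f"Patient is {store.read('age')} years old."


record = {"age": 52, "ssn": "000-00-0000", "diagnosis": "example"}
store = ScopedStore(record, allowed_keys=["age"])
print(summarize_age(store))
```

Because the agent never receives the full record, a bug or a compromised agent can leak at most the fields it was granted.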
Those are just three examples of many, giving a taste of how the type of security mindset that is prominent in security and crypto communities could aid with AI alignment challenges.
In addition to the AI safety problems you may try your hand at, let’s look at a few cases in which crypto security innovations can not only help tame AI but also unleash its beauty, for instance by enabling novel beneficial applications.
There are a few areas that traditional AI can’t really touch, in particular solving problems that require sensitive data like individuals’ health information or financial data that have strong privacy constraints.
Fortunately, as pointed out by cryptography researcher Georgios Kaissis, those are areas in which cryptographic and auxiliary approaches, such as federated learning, differential privacy, homomorphic encryption and more, shine. These emerging approaches to computation can tackle large sensitive datasets while maintaining privacy, and thus have a comparative advantage over centralized AI.
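To give a flavor of one of these techniques, the sketch below shows the Laplace mechanism, a standard building block of differential privacy, applied to a counting query. The dataset, function names and the choice of epsilon are assumptions made for illustration, and real deployments need careful privacy accounting.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count satisfying epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    # A counting query has sensitivity 1: adding or removing one person
    # changes the result by at most 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1 / epsilon, rng)


# Hypothetical sensitive data: patient ages.
ages = [34, 71, 52, 68, 45, 80]
rng = random.Random()
noisy = private_count(ages, lambda a: a > 65, epsilon=1.0, rng=rng)
print(f"Noisy number of patients over 65: {noisy:.2f}")
```

A smaller epsilon adds more noise and gives stronger privacy, so an analyst learns aggregate trends without being able to pin down any single individual.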
Leveraging local knowledge
Another area traditional AI struggles with is sourcing the local knowledge that is often required to solve edge cases in machine learning (ML) that big data cannot make sense of.
The crypto ecosystem could aid with local data provision by establishing marketplaces in which developers can use incentives to attract better local data for their algorithms. For instance, Coinbase co-founder Fred Ehrsam suggests combining private ML that allows for training on sensitive data with blockchain-based incentives that attract better data into blockchain-based data and ML marketplaces. While it may not be feasible or safe to open source the actual training of ML models, data marketplaces could pay creators their fair share for their data contributions.
Looking more long-term, it may even be possible to leverage cryptographic approaches to build AI systems that are both more secure and powerful.
For instance, cryptography researcher Andrew Trask suggests using homomorphic encryption to fully encrypt a neural network. If possible, this means that the intelligence of the network would be safeguarded against theft, enabling actors to cooperate on specific problems using their models and data, without revealing the inputs.
More importantly, though, if the AI is homomorphically encrypted, then it perceives the outside world as encrypted. The human who controls the secret key could unlock individual predictions that the AI makes, rather than letting the AI out into the wild itself.
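The idea can be sketched with a toy version of the Paillier cryptosystem, which is additively homomorphic. This is not Trask's construction: it covers only a single linear layer with encrypted weights, uses tiny hard-coded primes that offer no real security, and every function name is hypothetical. Encrypting a full neural network would require far more capable schemes.

```python
import math
import random

# Toy key material. These primes are far too small for real use.
P, Q = 104729, 1299709
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # secret: lcm(p-1, q-1)
MU = pow(LAM, -1, N)                               # secret: LAM^-1 mod N


def encrypt(m):
    """Encrypt a non-negative integer m < N with the public key N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt with the secret key (LAM, MU)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def encrypted_dot(enc_weights, inputs):
    """Compute Enc(sum(w_i * x_i)) without ever decrypting the weights."""
    acc = encrypt(0)
    for c, x in zip(enc_weights, inputs):
        # Raising a ciphertext to x multiplies the plaintext by x;
        # multiplying ciphertexts adds the plaintexts.
        acc = (acc * pow(c, x, N2)) % N2
    return acc


enc_weights = [encrypt(w) for w in [3, 1, 4]]       # model owner encrypts weights
enc_pred = encrypted_dot(enc_weights, [2, 7, 1])    # anyone can evaluate the model
print(decrypt(enc_pred))                            # only the key holder reads 17
```

Whoever runs the computation sees neither the weights nor the prediction in the clear, which is the property that lets the key holder release individual predictions one at a time.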
Again, these are just three examples of potentially many, in which crypto can unlock new use cases for AI.
Centralized AI suffers from single points of failure. Not only would it compress complex human value pluralism into one objective function, it is also prone to error, internal corruption and external attack. Secure multipolar systems, as built by the security and cryptography community, on the other hand, have lots of promise; they support value pluralism, can provide red-teaming and checks and balances, and are antifragile.
Cryptographic systems also have plenty of disadvantages. For instance, cryptography requires progress in decentralized data storage, functional encryption and adversarial testing, and in overcoming the computational bottlenecks that still make these approaches prohibitively slow and expensive. Moreover, decentralized systems are also less stable than centralized systems, and susceptible to rogue actors that always have an incentive to collude or otherwise overthrow the system to dominate it.
Nevertheless, given the rapid pace of AI progress, and the relative lack of security and cryptography-minded folks in AI, it is perhaps not too early to consider whether you could meaningfully contribute to AI, bringing some of the benefits discussed here to the table.
The promise of secure multipolar AI was summed up well by Eric Drexler, a technology pioneer, back in 1986: “The examples of memes controlling memes and of institutions controlling institutions also suggest that AI systems can control AI systems.”