Does the key to AI superalignment lie in the science of cooperation?
In biology, cooperative alignment within hierarchical systems is essential to survival. Are there lessons here for human flourishing in an age of artificial intelligence?
It was one of those double-take moments in a conversation where two seemingly different worlds meshed, and something quite unexpected emerged.
I was talking with two good friends and colleagues on the latest Future of Being Human … Unplugged broadcast, when the topic of OpenAI’s recent research on “superalignment” came up.
We were discussing cooperation science and AI, but at this particular point in the conversation we were talking about cooperation in biological systems.
What happened next was one of those serendipitous moments that academics like me live for, where ideas from different disciplines suddenly align and new insights begin to emerge.
For this episode of Unplugged I was joined by Mark Daley, Chief AI Officer at Western University, and Athena Aktipis, a leading expert in the science of cooperation at Arizona State University. Like all episodes in the series, the aim was to tease out new insights and ideas through casual and candid conversation, in this case focused on AI and cooperation.
The first 30 minutes of the conversation were illuminating — and began to draw out exactly the sorts of cross-disciplinary insights I was hoping for.
Then Mark mentioned Ilya Sutskever’s recent work on Superalignment, and things got interesting.
At this point I asked Athena about how the concept of superalignment resonated with biological systems. This is how the conversation went:
As with many such moments where disciplines collide, there’s nothing particularly new here from a biological perspective — hierarchical and cooperative systems in biology have been studied for years.
What was new — or was certainly new to me — was the insight into how the emerging concept of superalignment has such deep parallels with what exists in biology and the success of robust and resilient biological systems.
As the conversation progressed, it was very clear that highly functional and complex multi-agent systems exist where what might be considered “less intelligent” agents manipulate the behavior of “more intelligent” agents to their benefit — as well as the benefit of the system as a whole.
From a biological perspective, these relationships rely on aligned interests, and information flow through the organizational hierarchy in ways that are beneficial from an evolutionary perspective.
Within this — at least as Athena suggested — intelligence is an emergent property that supports the system at a meta level, but is not necessarily the primary purpose or the pinnacle achievement of the system.
Here, we talked a little about human exceptionalism, and how this can blind us to the multifaceted roles and behaviors of different agents within a complex system — including how viruses very effectively manipulate complex biological systems without an ounce of what we might consider “intelligence” from an AI perspective.
This led to an intriguing foray into the possibility of humans acting as metaphorical viruses in a world of superintelligent AIs … although I’m not sure how far anyone would want to push this analogy.
Beyond this though, it was very clear to me from the conversation that there are deep wells of insight within the science of cooperation that are directly relevant to the development of increasingly powerful AIs within a thriving human society.
There was, of course, a lot more to the conversation, which was riveting from beginning to end, and I would highly recommend listening to it in its entirety, either on YouTube or through the audio-only version below:
In the full broadcast we discussed the hypothesis that the “Universe, at its most fundamental level, is information” (Mark’s words), and how this impacts how we think of complex hierarchical systems — especially in terms of flow of information — irrespective of whether they are biological, digital, or embedded in some other substrate.
We touched on honest versus dishonest information flow, and how cooperative systems can be hijacked — cancers being a case in point.
We talked about how you can only have such challenges if cooperation exists first, and how threats to cooperative systems build resilience and immunity, which suggests that too much risk aversion is a bad thing.
We explored the importance of the beneficial division of labor in cooperative systems.
We brought in John Maynard Keynes and signaling theory at one point (that was Athena).
And we ended up, bizarrely but appropriately, on what AI developers can learn from kombucha, which is itself a highly fascinating and complex cooperative system.
At the end of our conversation I was left with a very clear sense that the emerging science of cooperation has a wealth of insights that are deeply important to the beneficial development of transformative AI — especially where cooperation is approached from the context of information, compute, and aligned interests within hierarchical multi-agent systems, irrespective of the “substrates” within which it resides.
Of course, there are counterpoints to this — as was apparent in our previous Unplugged episode where we touched on parasitic behaviors in evolved systems.
Either way though, there seems to be a compelling case for more collaboration — and even cooperation — between AI developers and cooperation scientists if we’re going to get AI right.
All of the Future of Being Human … Unplugged conversations can be accessed on the broadcast’s YouTube channel. And if you’d like an early heads-up on our Spring season, it’s worth signing up to the ASU Future of Being Human initiative mailing list.