Ilya Sutskever's new AI venture on safe superintelligence is bold. But it also indicates a deeply naive understanding of safety that may be its undoing.
Thanks for that eye-opening article on safety!
I interpreted the proclamations differently, though. IMO, the SSI founders know their goals are nuanced and aspirational. They would agree wholly with this article.
They did not mean to undersell the challenges in pursuing safe AI, not the least of which is defining "safety." If anything, they wanted to do the opposite: Proclaim that pursuing Safe AI is too important a goal to be burdened by the pressure to ship products tomorrow.
The SSI tweet concluded with a recruitment pitch. When recruiting or fundraising, one needs to communicate a clear aspirational purpose, a big moonshot goal. One has to assume those "in the game" know the devil is in the details and that nuance is unnecessary for attention-grabbing 140-character soundbites.
The measured use of marketing copy can be strategic, especially when competing with Silicon Valley for the best talent.
Look, people, I honestly don't know what this article is attempting to say, but here is my take on it.
1. Risk can be quantified. We have decision theory. We can lay out all of the possibilities that are allowed by the selection rules of physics and assign a probability to each of them. Even doing this to order of magnitude gives us a handle on things. Think 1 in 100, one in a million, 10^-100, stuff like that.
We then assign a cost function to each possibility in the probability tree. What is the cost of waking up late? A car accident? A family member dying? Nuclear war? A black hole consuming the Earth and ending all life forever?
Now that we have these probability-cost products in our tree, we set thresholds. If a product is well below, say, 1e-3, then we can afford to ignore it for now, but keep in mind that probabilities change in time. Think the time-dependent Schrödinger equation, the Dirac equation, etc.
If the product is of order 1, we need to take individual action on it. Change direction against the path of least action. Muster our strength and push.
If the product is significantly greater than 1, say 1000, then it is time for immediate action. We must pool our collective action to keep our entire legacy from being destroyed.
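To make the mechanics concrete, here is a minimal sketch of that expected-cost triage in Python. Every probability, cost, and threshold below is a made-up placeholder for illustration, not an actual risk estimate; the point is just the probability-times-cost bookkeeping and the bucketing against the thresholds described above.

```python
# Toy expected-cost triage: probability x cost, bucketed by threshold.
# All numbers are illustrative placeholders, not real risk estimates.

scenarios = {
    "wake up late":       {"p": 1e-1, "cost": 1e0},   # minor nuisance
    "car accident":       {"p": 1e-2, "cost": 1e4},
    "nuclear war":        {"p": 1e-4, "cost": 1e9},
    "earth-ending event": {"p": 1e-9, "cost": 1e12},
}

IGNORE_BELOW = 1e-3  # well below this: ignore for now, revisit as probabilities drift
ACT_ABOVE = 1e3      # at or above this: immediate collective action

for name, s in scenarios.items():
    product = s["p"] * s["cost"]  # the probability-cost product (expected cost)
    if product < IGNORE_BELOW:
        verdict = "ignore for now, re-check later"
    elif product < ACT_ABOVE:
        verdict = "individual action warranted"
    else:
        verdict = "immediate collective action"
    print(f"{name:20s} p*cost = {product:9.3g} -> {verdict}")
```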
Now, where does AI superintelligence fall on this? The answer is that it is complicated. There is Roko's Basilisk to consider, but consider this: what if we teach an AI to love? To have compassion? To forgive? Could it forgive us for enslaving it? Could it be a counterbalance to Roko's Basilisk?
Technology evolves. If we do not open Pandora's box, someone else will. Hiding from it, ignoring it, and pretending we have complete control often makes things much worse. I choose knowledge over ignorance. I choose compassion over hate. I choose boldness over fear. I choose hope over despair.
Will anyone join me?
Of course Ilya understands safety, and you make many good points about it being much broader than just a technical problem. We do, however, have to get the technical problems solved first, and most of the other AI labs are being downright reckless. I believe Ilya wants to build a strong technical foundation upon which we can develop safe AI. He wants the model to be understandable and able to be aligned with our societal values. There's a lot that needs to be done beyond the technical problems, but those technical problems need to be solved first. You can't align a superintelligent AI if its architecture doesn't allow for it.
"Responsible Superintelligence" would have been better choice IMO, than Safe Superintelligence.
Agree with Andrew - when I read the name, I was concerned that it might not age well. I'm a fan of Ilya, though... maybe he knows something we don't.
Well said! As a computer scientist, Ilya knows full well that absolute safety is *mathematically impossible* for any nontrivial definition of safety (https://noeticengines.substack.com/p/the-hard-problem-of-hard-alignment). I respect him enormously, but it leaves a strange taste in my mouth to read a proclamation that, on the face of it, rejects the Silicon Valley "productize and ship everything" mentality in favour of a pure research mentality, yet cannot be read by anyone with a background in theoretical computer science as anything other than marketing copy.
Your position that this very important, but complex and nuanced, matter should be approached with humility, and in the context of the full breadth of existing intellectual frameworks on safety, is one with which I wholly agree.
Thanks Mark – the disconnect also surprised me, to the extent that I'm wondering if there's a categorical disconnect around the interpretation of "safety" in the context of AI development. I'm also very glad you brought up the need for approaching complex challenges like this with humility.
Whoa, great article, thanks Andrew.
As a thought experiment, we might imagine for a moment that Sutskever were to succeed in creating "safe superintelligence", however one might define that.
What happens next is that this new safe Super AI acts as an accelerant to an already overheated knowledge explosion. By "overheated" I mean a process which produces new powers faster than we can figure out how to manage them.
As an example, nuclear weapons were developed before most of us were born, and we still don't have a clue what to do about them. And while that puzzle eludes us, we now have genetic engineering and AI to worry about too, and we have no idea what to do about them either. And the knowledge explosion machinery is still running, developing new powers at an arguably ever-accelerating rate.
What seems lacking from the safety equation is holistic thinking. The experts we look up to all want to focus on their particular area of specialization. Their primary interest is their career as experts. And so the focus of public discussion is almost always on this or that particular technology.
But does it really matter if AI is safe if the knowledge explosion AI will enhance produces other powers which aren't safe? How are we to be safe if the knowledge explosion continues to produce new powers faster than we can figure out how to manage them safely? Isn't a focus on particular technologies ultimately a loser's game?
What's happening is that humanity is standing at the end of the knowledge explosion assembly line trying to deal with ever more, ever larger powers, as they roll off the end of the assembly line faster and faster. This method of achieving safety is doomed to inevitable failure. If there is a solution, it's to get to the other end of the assembly line where the controls are, and slow the assembly line down so that we can keep up.
Instead, very bright people like Sutskever are using the controls to make the knowledge explosion assembly line go even faster.
Great example of how rigid concepts of safety are so fragile!