Position on AI and AI Research: the only wise approach is to halt development of more powerful AI indefinitely #403
Replies: 7 comments 14 replies
-
Eliezer's piece in Time: https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/

This 6-month moratorium would be better than no moratorium. I have respect for everyone who stepped up and signed it. It's an improvement on the margin. I refrained from signing because I think the letter is understating the seriousness of the situation and asking for too little to solve it.

The key issue is not "human-competitive" intelligence (as the open letter puts it); it's what happens after AI gets to smarter-than-human intelligence. Key thresholds there may not be obvious, we definitely can't calculate in advance what happens when, and it currently seems imaginable that a research lab would cross critical lines without noticing.

Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in "maybe possibly some remote chance," but as in "that is the obvious thing that would happen." It's not that you can't, in principle, survive creating something much smarter than you; it's that it would require precision and preparation and new scientific insights, and probably not having AI systems composed of giant inscrutable arrays of fractional numbers.

Without that precision and preparation, the most likely outcome is AI that does not do what we want, and does not care for us nor for sentient life in general. That kind of caring is something that could in principle be imbued into an AI but we are not ready and do not currently know how. Absent that caring, we get "the AI does not love you, nor does it hate you, and you are made of atoms it can use for something else."

The likely result of humanity facing down an opposed superhuman intelligence is a total loss. Valid metaphors include "a 10-year-old trying to play chess against Stockfish 15", "the 11th century trying to fight the 21st century," and "Australopithecus trying to fight Homo sapiens".

To visualize a hostile superhuman AI, don't imagine a lifeless book-smart thinker dwelling inside the internet and sending ill-intentioned emails. Visualize an entire alien civilization, thinking at millions of times human speeds, initially confined to computers—in a world of creatures that are, from its perspective, very stupid and very slow. A sufficiently intelligent AI won't stay confined to computers for long. In today's world you can email DNA strings to laboratories that will produce proteins on demand, allowing an AI initially confined to the internet to build artificial life forms or bootstrap straight to postbiological molecular manufacturing.

If somebody builds a too-powerful AI, under present conditions, I expect that every single member of the human species and all biological life on Earth dies shortly thereafter.

There's no proposed plan for how we could do any such thing and survive. OpenAI's openly declared intention is to make some future AI do our AI alignment homework. Just hearing that this is the plan ought to be enough to get any sensible person to panic. The other leading AI lab, DeepMind, has no plan at all.

An aside: None of this danger depends on whether or not AIs are or can be conscious; it's intrinsic to the notion of powerful cognitive systems that optimize hard and calculate outputs that meet sufficiently complicated outcome criteria.

With that said, I'd be remiss in my moral duties as a human if I didn't also mention that we have no idea how to determine whether AI systems are aware of themselves—since we have no idea how to decode anything that goes on in the giant inscrutable arrays—and therefore we may at some point inadvertently create digital minds which are truly conscious and ought to have rights and shouldn't be owned.

The rule that most people aware of these issues would have endorsed 50 years earlier, was that if an AI system can speak fluently and says it's self-aware and demands human rights, that ought to be a hard stop on people just casually owning that AI and using it past that point. We already blew past that old line in the sand. And that was probably correct; I agree that current AIs are probably just imitating talk of self-awareness from their training data. But I mark that, with how little insight we have into these systems' internals, we do not actually know.

If that's our state of ignorance for GPT-4, and GPT-5 is the same size of giant capability step as from GPT-3 to GPT-4, I think we'll no longer be able to justifiably say "probably not self-aware" if we let people make GPT-5s. It'll just be "I don't know; nobody knows." If you can't be sure whether you're creating a self-aware AI, this is alarming not just because of the moral implications of the "self-aware" part, but because being unsure means you have no idea what you are doing and that is dangerous and you should stop.

On Feb. 7, Satya Nadella, CEO of Microsoft, publicly gloated that the new Bing would make Google "come out and show that they can dance." "I want people to know that we made them dance," he said. This is not how the CEO of Microsoft talks in a sane world. It shows an overwhelming gap between how seriously we are taking the problem, and how seriously we needed to take the problem starting 30 years ago. We are not going to bridge that gap in six months.

It took more than 60 years between when the notion of Artificial Intelligence was first proposed and studied, and for us to reach today's capabilities. Solving safety of superhuman intelligence—not perfect safety, safety in the sense of "not killing literally everyone"—could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we've overcome in our history, because we are all gone.

Trying to get anything right on the first really critical try is an extraordinary ask, in science and in engineering. We are not coming in with anything like the approach that would be required to do it successfully. If we held anything in the nascent field of Artificial General Intelligence to the lesser standards of engineering rigor that apply to a bridge meant to carry a couple of thousand cars, the entire field would be shut down tomorrow.

We are not prepared. We are not on course to be prepared in any reasonable time window. There is no plan. Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die.

Many researchers working on these systems think that we're plunging toward a catastrophe, with more of them daring to say it in private than in public; but they think that they can't unilaterally stop the forward plunge, that others will go on even if they personally quit their jobs. And so they all think they might as well keep going. This is a stupid state of affairs, and an undignified way for Earth to die, and the rest of humanity ought to step in at this point and help the industry solve its collective action problem.

Some of my friends have recently reported to me that when people outside the AI industry hear about extinction risk from Artificial General Intelligence for the first time, their reaction is "maybe we should not build AGI, then." Hearing this gave me a tiny flash of hope, because it's a simpler, more sensible, and frankly saner reaction than I've been hearing over the last 20 years of trying to get anyone in the industry to take things seriously. Anyone talking that sanely deserves to hear how bad the situation actually is, and not be told that a six-month moratorium is going to fix it.

On March 16, my partner sent me this email. (She later gave me permission to excerpt it here.)

"Nina lost a tooth! In the usual way that children do, not out of carelessness! Seeing GPT4 blow away those standardized tests on the same day that Nina hit a childhood milestone brought an emotional surge that swept me off my feet for a minute. It's all going too fast. I worry that sharing this will heighten your own grief, but I'd rather be known to you than for each of us to suffer alone."

When the insider conversation is about the grief of seeing your daughter lose her first tooth, and thinking she's not going to get a chance to grow up, I believe we are past the point of playing political chess about a six-month moratorium.

If there was a plan for Earth to survive, if only we passed a six-month moratorium, I would back that plan. There isn't any such plan. Here's what would actually need to be done:

The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.

Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold.

If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.

Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that's what it takes to reduce the risk of large AI training runs.

That's the kind of policy change that would cause my partner and I to hold each other, and say to each other that a miracle happened, and now there's a chance that maybe Nina will live. The sane people hearing about this for the first time and sensibly saying "maybe we should not" deserve to hear, honestly, what it would take to have that happen. And when your policy ask is that large, the only way it goes through is if policymakers realize that if they conduct business as usual, and do what's politically easy, that means their own kids are going to die too.

Shut it all down.

We are not ready. We are not on track to be significantly readier in the foreseeable future. If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.

Shut it down.
-
Ooof I don't know what's bleaker, the message or the pushback on LessWrong
-
Thanks for sharing this, Rufus! Some quick thoughts:
- In "clear and present dangers of machine learning / AI", include the present ecological and human impact of mining, building, training, maintaining, and discarding AI systems. AI is often discussed as something non-material, but it's a global and physical entity.
- Very much appreciate that you write, "Does not need to involve 'self-consciousness' to be an existential risk". Too often the focus is on AGI or consciousness, and this stretches us only into the future and away from the concrete effects of AI systems.
- In "Specifics", you might add certain areas where AI might be used (as in the Eliezer piece, where he would allow for biological research). Not sure where I stand on this... perhaps it's easier/clearer to say what AI should not do rather than what it should. Anyway, it's a point for further thinking!
- In the FAQs, you might include that alignment research often assumes that AI will exist and seeks a technological solution for something that may be best sorted out without tech. As a philosopher I think there's some exciting thinking to be done on tech-determinism, tech solutionism, and how temporality is treated in these discussions.

Back to work for me! Thanks for the interesting ideas.
-
I've been thinking of "the difference that makes a difference", à la Gregory Bateson, and I'm landing on the conviction that only poetry, art, and humour, in the face of AI advances, will get us through. New interview today with EY: https://www.youtube.com/watch?v=AaTRHFaaPG8
-
The question that comes to my mind is: how possible is it to actually enforce this? How many GPU clusters already exist that could operate without scrutiny, and how easy would it be to set up a new cluster? I'm thinking governments, crypto miners, small-scale cloud providers, render farms, unknown entities, etc. One risk would be that this move pushes model training underground rather than stopping it. Non-proliferation of nuclear weapons seems like a much easier coordination problem given what it takes to produce a nuclear weapon versus train an AI model. Also, I'm still not so clear on the risk vectors. Or is the point that there is not enough clarity here? I've only heard broad mentions of the AI-alignment problem. The vectors that come to mind for me (as a layperson) when I think about existential risk:
I guess 1-3 might all be different forms of AI alignment problems? It sounds like Eliezer Yudkowsky is mostly concerned with 3. But even if we managed to entirely solve AI alignment I can’t see how this solves the real issue in the long run: that AI could attain the capability (aligned or not) to wipe out the human race. Put another way – if AI can wipe out the human race due to misalignment, then surely it follows that it would also be able to wipe out the human race through perfectly aligned actions guided by a malevolent actor. And if that's true, it’s hard to see how this problem could be solved without somehow limiting the sphere of influence of any single entity (i.e. dismantling the networks that connect our world). I'd love to get some more clarity on this if anyone can point to relevant resources.
-
The "how" relates to a "should" fallacy. I see stuff like this regularly: https://www.youtube.com/watch?v=ooczthpG0uY Note how it talks about whether a pause would be feasible, not whether a pause is desirable. This shift not only moves our focus away from the key question of what we should do and onto how we could do it, but also, in doing so, undermines the "should". This framing, which I see regularly, is deeply confusing because it conflates two distinct and sequenced questions:

Qu 1: should we do X, e.g. have an AI pause?
Qu 2: if so, how could we do it?

By conflating the two parts you use concerns about (2) to undermine (1), which cleverly undermines action on (2) (because if people don't think you should do it, it is harder to get collective action to do it). Regarding (2), it is obvious that if everyone (or, even, a clear majority who were willing to use state powers to enforce this) thought we should have an AI pause, then doing it would be pretty easy actually.
-
I think this is a thoughtful thread and much better than most of the discussions I've seen online about the subject. Let me provide some counterpoints to the above thesis, which perhaps is missing a perspective that I think a great many software engineers hold: that LLMs are concerning because of their capacity to confuse us, because they sit in the uncanny valley of agency. We've built something that's a simulacrum of intelligence but isn't anywhere close to the end goal of actually being intelligent or having agency. The algorithms are feeding our words back to us in a way that is designed to be maximally plausible and convincing, but perhaps that doesn't get us anywhere close to the end state. I suggest the possibility that there is a gap in the technology that represents an incredibly large engineering problem we will be grappling with for generations. Transformers are a very interesting technology, but I don't think they are as exponential or universal as some people have suggested. There are already tons of tools that are superhuman at what they do. Calculators do arithmetic far better than humans ever will. AlphaFold is better at protein folding than humans playing Foldit. Stockfish plays chess better than any human. The arc of history is that our tools will always exceed our capacities, which is progress. I would argue that LLMs are far more like these tools than some "master algorithm". They just happen to be superhuman at several specific tasks:
1. Text summarization
2. Probabilistic recitation and pattern extrapolation
3. Generating maximally plausible text (bullshitting, in the sense discussed below)
Text summarization is the most banal and least controversial of these, and it's where companies will actually use this technology. Having Siri be able to pull a BBC article and summarize the main points is genuinely useful. The probabilistic recitation and pattern extrapolation found in things like GitHub Copilot is also genuinely useful. An LLM can regurgitate large portions of Wikipedia and string together sentences that are sometimes accidentally true, but it's important to remember that it's not optimizing for truth or correctness... it's optimizing for maximally plausible text. Ask it, for instance, to summarize an article given nothing but a made-up headline:
Obviously, the article doesn't exist, but the model will confidently extrapolate from the title to what it probabilistically could be. And that goes to the third point: these Transformer models built to do predictive text completion are bullshit engines. And I don't mean that in a pejorative sense. I mean it in the sense defined in "On Bullshit" by Frankfurt. Bullshit is speech or communication that is "unconnected to a concern with the truth." It differs from lying in that the bullshitter is not necessarily concerned with whether its claims are true or false. We've just created a simulacrum of the bullshitter, which has no agency or intent and just blindly bullshits based on a vast corpus of internet text. We've created the AlphaFold or the Stockfish of bullshitting. Which is humorous and possibly useful, but I don't think really all that scary.

It doesn't surprise me that, given a vast internet corpus to crib from, it will do quite well on things like MBA exams and short-response basic reasoning tests. But for physics, math, chemistry, or any goal-directed long-running task where the output is highly contingent on truth, I highly doubt these models will scale up to the frontiers of human knowledge. One need only play around with GitHub Copilot (which I use every day) for a while to get a good feeling for the hard limitations of these models. They're quite good at boilerplate generation but don't even have the context or memory to hold enough information to make changes across a small web app. If you were to ask it to "write a faster matrix multiplication algorithm than LAPACK", that's a task vastly outside its training corpus and impossible for it to do. This becomes even more pronounced if you ask it to prove a theorem at the edge of research in algebraic topology or something, where even the tiniest amount of bullshit or error would completely invalidate the proof.

And that's really my issue with the "AI intelligence explosion" hypotheses: they're all predicated on the assumed existence of massive self-improving leaps, and I struggle to see how such leaps arise out of these models. In their current form, these models are like a cross between a parrot, an encyclopedia, and a goldfish. They're blind stochastic recitation machines without agency, intent, or grounding in any base reality, and that's not going to get us very far in advancing scientific knowledge, because science is completely antithetical to bullshit. Thus, I claim that the "intelligence explosion" and dystopian scenarios some people are proposing are predicated on specious logical leaps and assumptions that depend on properties we don't observe in our systems today.

My primary concern, and where I'm sympathetic to the manifesto proposed, is that humans combined with a superhuman bullshit engine might be a net negative for civilization. We saw what disinformation did in the social media era, and I don't see the phenomenon going away. Imagine a generative-AI conspiracy cult, and the possibilities are truly haunting. The other major harm is that vulnerable people truly begin to believe that these models have agency or personality and begin to relate to them as if they did, thus making them susceptible to manipulation. There might also be, as you mentioned, "trans-rational" reasons why we don't want to continue down this line of research.
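To make the earlier point about "maximally plausible text" concrete, here is a deliberately toy sketch. This is my own illustration, not how GPT-4 or Copilot actually works (real systems are transformer networks over subword tokens, not bigram count tables), but the objective, predicting a plausible next token, is the same in spirit:

```python
# A toy illustration (not any real model): a "language model" is just a
# next-token probability table estimated from a corpus. Generation picks
# high-probability continuations; nothing anywhere checks truth.
from collections import defaultdict, Counter

# Hypothetical miniature "training corpus" -- the only thing the model ever sees.
corpus = (
    "the moon is made of cheese . "
    "the moon is made of cheese . "
    "the moon is made of rock . "
).split()

# "Training": count bigram frequencies.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def generate(prompt, n_tokens=4):
    """Greedy next-token prediction: always emit the most *plausible*
    (i.e. most frequent) continuation, with no notion of correctness."""
    tokens = prompt.split()
    for _ in range(n_tokens):
        nxt_counts = counts.get(tokens[-1])
        if not nxt_counts:
            break
        tokens.append(nxt_counts.most_common(1)[0][0])
    return " ".join(tokens)

print(generate("the moon is"))
# -> "the moon is made of cheese ." -- plausible given the corpus, not true.
```

Nothing in that loop ever consults the world; it only consults the corpus, which is exactly what "unconnected to a concern with the truth" means here.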
People could find these models simply "creepy," which might be sufficient cause for certain democratic societies to ban their use if enough of the electorate share that perspective. However, I don't think these questions are beyond the reach of reason. It is worth analyzing and questioning whether the assumptions behind the existential doom scenarios are logically well-formed before writing a manifesto based on the assumption that such an outcome is either possible or likely. Some sources I find compelling:
-
UPDATE 2023-04-14: posted an in-progress position statement: https://lifeitself.org/ai
I have held this position for many years but believed the prospect and the risk were much more distant. However, developments in the last few years, and especially the last few months, have convinced me that the matter is much more pressing.
I'd be interested to hear others' thoughts - including those with very different views!
In terms of a plan of action, I believe we ultimately need a mass movement and political pressure to enact international restrictions and regulation. One of the first steps toward a mass movement is some kind of manifesto and an endorsement thereof. This can be a seed for local movements which can then grow.
This is a huge opportunity for positive collective action. I am actually positive right now because it suddenly seems like a bunch of people have woken up to the gravity of the situation. E.g. Eliezer was prominent in what are called alignment circles, which I have long viewed as a distraction (or worse) from straight-up collective action for restrictions. So I think there is movement here. I am posting because this is a moment to break out of the resignation that I have seen in my discussions over the last decade, especially in tech. Any suggestion that we should slow down or halt was met with the riposte that "oh, that is impossible because someone else will do it": that is, a general resignation about collective political action that infects our whole culture. This is a moment to break out. 🤞
Draft of a Statement / Manifesto: Citizens for AI Restriction (CAIR)
We call for an immediate moratorium on advanced machine learning / AI developments [until such time as humanity has had sufficient time to determine a wise course of advance]
Why? Clear and present dangers of machine learning / AI
Other points:
Specifics
General points for manifesto
FAQs
What about alignment efforts?
Alignment efforts mean work to ensure that AI systems have "positive" values and most importantly don't want to harm humans and humanity as a whole.
Alignment efforts may have some place in a "diversified" approach to AI de-risking but it must be very clear they are secondary.
Alignment efforts can become distractions or even actually harmful by legitimating AI accelerationists
Collective action isn't possible
Steel-man the thesis
Critique it
Point to successes in collective action
Computation:
https://ourworldindata.org/brief-history-of-AI
https://ourworldindata.org/grapher/artificial-intelligence-number-training-datapoints
https://ourworldindata.org/grapher/artificial-intelligence-training-computation
If we stop, won't others do it anyway (China if you are the US, the US if you are China, etc.)?
This is a collective action problem.
Self-interest and bias of tech community
🚩 may be a bit too critical and end up losing allies etc.
Links to other pieces
Footnotes
The one obvious exception would be if AI were credibly, and with very high probability, going to help us avoid an equally immediate and significant existential threat. (The classic thought experiment: we knew we were going to be attacked by a more powerful alien species intent on destroying Earth, and AI offered a plausible method of defeating them.) However, there is no such remotely credible near-term existential risk that stronger AI would help us avoid. ↩
The creation of GPT-3 is an incredible achievement. The LLM has 175 billion parameters, a record at the time of its release. But digesting vast amounts of text scraped from the internet takes some doing. And OpenAI required a custom-built supercomputer – hosted on Azure by its investment partner, Microsoft – to complete the task. The machine features a whopping 10,000 GPUs that had to whirr away for months, and consumed plenty of energy in the process. https://techhq.com/2023/03/llama-leak-mixed-blessing-for-facebook-ai/ ↩
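As a rough back-of-the-envelope check on the scale involved, here is a sketch using commonly cited figures (175 billion parameters, roughly 300 billion training tokens, and the standard 6 × parameters × tokens approximation for training FLOPs); the token count and per-GPU throughput are assumptions for illustration, not figures from the article above:

```python
# Rough training-compute estimate for a GPT-3-scale model.
# Assumptions: 175e9 parameters, ~300e9 training tokens (commonly cited for GPT-3),
# and the standard approximation FLOPs ≈ 6 * parameters * tokens.
params = 175e9
tokens = 300e9
total_flops = 6 * params * tokens
print(f"total training compute ≈ {total_flops:.2e} FLOPs")  # ≈ 3.15e+23

# Why a ~10,000-GPU cluster is needed: even assuming an optimistic sustained
# 3e13 FLOP/s per GPU, the ideal-case wall-clock time is already substantial,
# and real-world utilization, failures and re-runs stretch it much further.
gpus = 10_000
flops_per_gpu = 3e13
days = total_flops / (gpus * flops_per_gpu) / 86_400
print(f"≈ {days:.0f} days of pure compute at ideal utilization")  # ≈ 12
```

The total is in line with the ~3.14 × 10^23 FLOPs figure usually cited for GPT-3's training run.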