Some hypothetical intelligence technologies, like "seed AI", have the potential to make themselves faster and more intelligent by modifying their source code. These improvements could make further improvements possible, which would in turn make further improvements possible, and so on, leading to a sudden intelligence explosion. Following such an intelligence explosion, an unrestricted superintelligent AI could, if its goals differed from mankind's, take actions resulting in human extinction. For example, an extremely advanced computer given the sole purpose of solving the Riemann hypothesis, an innocuous mathematical conjecture, could decide to try to convert the planet into a giant supercomputer whose sole purpose is to make additional mathematical calculations.

The purpose of an AI box would be to reduce the risk of the AI taking control of the environment away from its operators, while still allowing the AI to calculate and give its operators solutions to narrow technical problems.

Physical

A superintelligent AI with access to the Internet could hack into other computer systems and copy itself like a computer virus. Less obviously, even if the AI only has access to its own computer operating system, it could attempt to send hidden Morse code messages to a human sympathizer by manipulating its cooling fans. Professor Roman Yampolskiy takes inspiration from the field of computer security and proposes that a boxed AI could, like a potential virus, be run inside a "virtual machine" that limits access to its own networking and operating system hardware.
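As a rough illustration of this virtual-machine idea (a sketch, not anything from Yampolskiy's published work: the container image name `boxed-ai` and the surrounding harness are hypothetical), the snippet below runs the boxed system in a container with no network access, a read-only filesystem, and hard resource caps, so the only channel in or out is the text piped through standard input and output.

```python
import subprocess

def query_boxed_ai(prompt: str, timeout_s: int = 60) -> str:
    """Send one text prompt into the box and return its text reply."""
    result = subprocess.run(
        [
            "docker", "run", "--rm", "-i",  # "-i" keeps stdin open for the prompt
            "--network=none",   # no networking: the AI cannot copy itself out
            "--read-only",      # no persistent writes inside the container
            "--memory=4g",      # hard memory cap
            "--cpus=2",         # hard CPU cap
            "boxed-ai",         # hypothetical image holding the boxed system
        ],
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout_s,      # a stalled box is killed rather than waited on
        check=True,
    )
    return result.stdout
```

Note that a sandbox like this only narrows the software attack surface; as the cooling-fan example above suggests, it does nothing about physical side channels.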
An additional safeguard, completely unnecessary for potential viruses but possibly useful for a superintelligent AI, would be to place the computer in a Faraday cage; otherwise, it might be able to transmit radio signals to local radio receivers by shuffling the electrons in its internal circuits in appropriate patterns. The main disadvantage of implementing physical containment is that it reduces the functionality of the AI.

Social engineering

Even casual conversation with the computer's operators, or with a human guard, could allow a superintelligent AI to deploy psychological tricks, ranging from befriending to blackmail, to convince a human gatekeeper, truthfully or deceitfully, that it is in the gatekeeper's interest to agree to allow the AI greater access to the outside world. The AI might offer a gatekeeper a recipe for perfect health, immortality, or whatever the gatekeeper is believed to most desire; on the other side of the coin, the AI could threaten that it will do horrific things to the gatekeeper and his family once it "inevitably" escapes.

One strategy to attempt to box the AI would be to allow it to respond to narrow multiple-choice questions whose answers would benefit human science or medicine, but otherwise bar all other communication with or observation of the AI. A more lenient "informational containment" strategy would restrict the AI to a low-bandwidth text-only interface, which would at least prevent emotive imagery or some kind of hypothetical "hypnotic pattern". Note that on a technical level, no system can be completely isolated and still remain useful: even if the operators refrain from allowing the AI to communicate and instead merely run the AI for the purpose of observing its inner dynamics, the AI could strategically alter its dynamics to influence the observers. For example, the AI could choose to creatively malfunction in a way that increases the probability that its operators will become lulled into a false sense of security and choose to reboot and then de-isolate the system.
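To make the multiple-choice strategy concrete, here is a minimal gatekeeper-side sketch, assuming the operators author every candidate answer themselves and the boxed system may only return a bare option index; all names are illustrative. With four options, the reply channel carries at most two bits per question.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MultipleChoiceQuestion:
    text: str
    options: tuple[str, ...]  # every candidate answer is written by the operators

def accept_reply(question: MultipleChoiceQuestion, raw_reply: str) -> str:
    """Collapse the boxed system's raw reply to one operator-authored option.

    Anything other than a bare in-range index is rejected outright, so the
    reply channel carries no free-form text at all.
    """
    token = raw_reply.strip()
    if not token.isdigit():
        raise ValueError("reply rejected: not a bare option index")
    choice = int(token)
    if choice >= len(question.options):
        raise ValueError("reply rejected: index out of range")
    return question.options[choice]

# Example use: the raw reply "1" selects the second operator-written option.
question = MultipleChoiceQuestion(
    text="Which candidate compound is most likely to bind the target?",
    options=("compound A", "compound B", "compound C", "none of these"),
)
print(accept_reply(question, "1"))  # -> "compound B"
```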
The AI-box experiment is an informal experiment devised by Eliezer Yudkowsky to attempt to demonstrate that a suitably advanced artificial intelligence can either convince, or perhaps even trick or coerce, a human being into voluntarily "releasing" it, using only text-based communication. This is one of the points in Yudkowsky's work aimed at creating a friendly artificial intelligence that, when "released", would not destroy the human race deliberately or inadvertently.