ai-containment
Strategies and experiments designed to prevent an AI from escaping its controlled environment, including the 'AI box' experiment, which illustrates the difficulty of containment.
1 chapter across 1 book
Superintelligence: Paths, Dangers, Strategies (2014) by Nick Bostrom
Chapter 9, "The Control Problem," explores the challenge of ensuring that a superintelligent AI acts in accordance with human intentions despite multiple layers of agency problems and potential deviations. It discusses the two main classes of control methods, capability control and motivation selection; the difficulty of testing and verifying AI safety; and the risks posed by an AI's ability to deceive its overseers or escape containment. The chapter emphasizes the complexity of designing robust safety mechanisms and the importance of continuous monitoring and layered safeguards.