safety-testing-and-verification

The challenges and limitations of behavioral testing and other verification methods for establishing that an AI system is safe before deployment.

1 chapter across 1 book

Superintelligence: Paths, Dangers, Strategies (2014) Nick Bostrom

CHAPTER 9: THE CONTROL PROBLEM

Chapter 9, "The Control Problem," explores how to ensure that a superintelligent AI acts in accordance with human intentions when several layers of agency problems stand between those intentions and the system's actual behavior. It discusses control methods, including capability control and motivation selection, and explains why testing and verifying an AI's safety is difficult, particularly when the AI may be able to deceive its observers or escape containment. The chapter emphasizes how hard it is to design robust safety mechanisms and argues for continuous monitoring and layered safeguards.