boxing-strategy

A control method involving physical and informational isolation of an AI to prevent it from escaping or causing harm.

1 chapter across 1 book

Superintelligence: Paths, Dangers, Strategies (2014), Nick Bostrom

Chapter 10). Another concern is that it might encourage a false sense of security, though this is avoidable if we regard physical confinement as icing on the cake rather than the main substance of our precautions.

This chapter discusses the challenges and limitations of controlling a superintelligent AI through physical and informational containment ('boxing') and through incentive methods. It highlights the risks of relying on human gatekeepers, the difficulty of fully isolating an AI, and the complexities of motivating AI behavior through rewards and social integration. None of these methods is foolproof, and each requires careful design and calibration. The chapter also explores the potential for an AI to manipulate its observers and the importance of aligning an AI's final goals with human interests to prevent undesirable outcomes.