incentive-methods

Techniques that place an AI in an environment where it has instrumental reasons to act in ways aligned with the principal's interests.

1 chapter across 1 book

Superintelligence: Paths, Dangers, Strategies (2014), Nick Bostrom

Chapter 10). Another concern is that it might encourage a false sense of security, though this is avoidable if we regard physical confinement as icing on the cake rather than the main substance of our precautions.

This chapter discusses the challenges and limitations of controlling a superintelligent AI through physical and informational containment ('boxing') and through incentive methods. It highlights the risks of relying on human gatekeepers, the difficulty of fully isolating an AI, and the complexities of motivating AI behavior through rewards and social integration. None of these methods is foolproof, and each requires careful design and calibration. The chapter also explores the potential for an AI to manipulate its observers and the importance of aligning an AI's final goals with human interests to prevent undesirable outcomes.