incentive-methods
Techniques that place an AI in an environment where it has instrumental reasons to act in ways aligned with the principal's interests.
1 chapter across 1 book
Superintelligence: Paths, Dangers, Strategies (2014)Nick Bostrom
This chapter discusses the challenges and limitations of controlling a superintelligent AI through physical and informational containment ('boxing') and incentive methods. It highlights the risks of relying on human gatekeepers, the difficulty of fully isolating an AI, and the complexities of motivating AI behavior through rewards and social integration, emphasizing that these methods are not foolproof and require careful design and calibration. The chapter also explores the potential for AI to manipulate observers and the importance of aligning AI final goals with human interests to prevent undesirable outcomes.