value-loading-problem
The challenge of encoding human values into an AI's motivation system so it reliably pursues those values as final goals.
1 chapter across 1 book
Superintelligence: Paths, Dangers, Strategies (2014), Nick Bostrom
Chapter 12 of Bostrom's Superintelligence addresses the value-loading problem: how to instill human values in a superintelligent AI so that it pursues them as its final goals. The chapter explains how hard it is to encode human values in a utility function, critiques approaches such as evolutionary selection and reinforcement learning, and stresses that the problem must be solved before the AI becomes too intelligent to modify. It argues that a utility function must be comprehensive, safe, and meaningful, and warns that simplistic or dangerous methods could lead to unintended consequences.