technical-safety
The implementation of design constraints and rigorous testing to reduce harmful outputs and biases in AI systems.
1 chapter across 1 book
The Coming Wave: Technology, Power, and the Twenty-first Century's Greatest Dilemma (2023) Mustafa Suleyman; Michael Bhaskar
Chapter 14 outlines a multi-layered approach to containing advanced technologies, particularly AI, and ensuring their safety, starting from technical fixes and expanding to broader societal and regulatory measures. It emphasizes the rapid progress made in reducing biases in large language models through reinforcement learning from human feedback, and it advocates a large-scale, well-funded 'Apollo program' dedicated to AI safety and biosafety research. The chapter also discusses the importance of physical containment, technical safety standards, explainability, uncertainty management, and the development of AI systems that can monitor and improve other AIs to prevent harmful outcomes.