AI Safety Policies - Thinking AI

It’s encouraging to see some benefits of the Bletchley Summit already, even before it has taken place. In his speech on AI, Rishi Sunak spoke of how the UK was prepared to lead the way on AI safety, including by being a trusted overseer and partner to tech firms. To that end, it is good to see six of the major AI companies publish their safety policies.

Amazon, Anthropic, Google DeepMind, Meta, Microsoft and OpenAI have all published their safety policies; you can read them here. The policies were requested ahead of the summit and cover nine areas of AI safety policy, listed below:

Responsible Capability Scaling provides a framework for managing risk as organisations scale the capability of frontier AI systems, enabling companies to prepare for potential future, more dangerous AI risks before they occur
Model Evaluations and Red Teaming can help assess the risks AI models pose and inform better decisions about training, securing, and deploying them
Model Reporting and Information Sharing increases government visibility into frontier AI development and deployment and enables users to make well-informed choices about whether and how to use AI systems
Security Controls Including Securing Model Weights are key underpinnings for the safety of an AI system
Reporting Structure for Vulnerabilities enables outsiders to identify safety and security issues in an AI system
Identifiers of AI-generated Material provide additional information about whether content has been AI-generated or modified, helping to prevent the creation and distribution of deceptive AI-generated content
Prioritising Research on Risks Posed by AI will help identify and address the emerging risks posed by frontier AI
Preventing and Monitoring Model Misuse is important as, once deployed, AI systems can be intentionally misused for harmful outcomes
Data Input Controls and Audits can help identify and remove training data likely to increase the dangerous capabilities that frontier AI systems possess, and the risks they pose
