AI Safety Policies

It’s encouraging to see some benefits of the Bletchley Summit already, even before it has happened. In Rishi Sunak’s speech on AI, he spoke of how the UK was prepared to lead the way on AI safety, including by being a trusted overseer and partner to tech firms. To that end, it’s good to see six of the major AI companies publish their safety policies.

Amazon, Anthropic, Google DeepMind, Meta, Microsoft and OpenAI have all published their safety policies, and you can read them here. These were requested ahead of the summit and cover nine areas of AI safety policy, listed below:

  • Responsible Capability Scaling provides a framework for managing risk as organisations scale the capability of frontier AI systems, enabling companies to prepare for potential future, more dangerous AI risks before they occur
  • Model Evaluations and Red Teaming can help assess the risks AI models pose and inform better decisions about training, securing, and deploying them
  • Model Reporting and Information Sharing increases government visibility into frontier AI development and deployment and enables users to make well-informed choices about whether and how to use AI systems
  • Security Controls Including Securing Model Weights are key underpinnings for the safety of an AI system
  • Reporting Structure for Vulnerabilities enables outsiders to identify safety and security issues in an AI system
  • Identifiers of AI-generated Material provide additional information about whether content has been AI generated or modified, helping to prevent the creation and distribution of deceptive AI-generated content
  • Prioritising Research on Risks Posed by AI will help identify and address the emerging risks posed by frontier AI
  • Preventing and Monitoring Model Misuse is important as, once deployed, AI systems can be intentionally misused for harmful outcomes
  • Data Input Controls and Audits can help identify and remove training data likely to increase the dangerous capabilities their frontier AI systems possess, and the risks they pose
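As a toy illustration of the ‘Identifiers of AI-generated Material’ idea – my own sketch, not any company’s actual scheme – a provider could attach a signed provenance tag to generated content, so that anyone holding the verification key can check whether material was machine-generated and whether the label has been tampered with:

```python
import hmac
import hashlib

# Hypothetical signing key held by the AI provider (illustrative only).
SIGNING_KEY = b"provider-secret-key"

def tag_content(content: bytes) -> str:
    """Produce a provenance tag asserting that the content is AI-generated."""
    digest = hmac.new(SIGNING_KEY, content, hashlib.sha256).hexdigest()
    return f"ai-generated:{digest}"

def verify_tag(content: bytes, tag: str) -> bool:
    """Check that a tag matches the content, i.e. the label wasn't forged."""
    expected = tag_content(content)
    return hmac.compare_digest(expected, tag)

text = b"A paragraph produced by a model."
tag = tag_content(text)
assert verify_tag(text, tag)            # genuine label verifies
assert not verify_tag(b"edited", tag)   # tampered content fails
```

Real schemes (content-credential standards or statistical watermarks baked into model outputs) are far more involved; this only shows the verifiable-label principle the policy area points at.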

Why invite China?

In the past, a public intervention by a former UK Prime Minister might have been considered big news, but these days it seems to happen most months. This time it was the turn of Liz Truss, on the topic of AI. Truss has largely been ridiculed by the UK media: the shortest-serving PM, accused of crashing the economy.

However, Truss’ views shouldn’t be dismissed too readily. She held posts in government from 2014, most recently the offices of Foreign Secretary and Prime Minister. The rise of AI, alongside ethical and geopolitical concerns around China, will have been areas on which she was well briefed.

In her letter to Rishi Sunak, she asks for China’s invitation to the Bletchley AI Summit to be rescinded, linking this to the 2020 decision to ban the use of Huawei equipment in UK 5G infrastructure. She writes:

We should be working with our allies, not those seeking to subvert freedom and democracy.

I’m not sure whether Rishi Sunak has responded directly, but he alluded to the decision in his recent speech on AI. The key phrase “AI doesn’t respect borders” expresses the need for global cooperation. Sunak responds:

And yes, we’ve invited China. I know there are some who will say they should have been excluded. But there can be no serious strategy for AI, without at least trying to engage all of the world’s leading AI powers. That might not have been the easy thing to do, but it was the right thing to do.

Sunak is right that AI doesn’t respect borders, and that there needs to be global collaboration. Given how high the stakes are, with existential risk and AI warfare, it would be foolish for any nation to unilaterally agree red lines and fall behind; equally, no one wants to fuel an AI ‘arms race’. Global consensus is vital.

China is rightly critiqued for its human rights abuses and its posturing in the South China Sea, but we can’t just assume that it has got it completely wrong on AI. We might immediately recoil from Chinese tech ideas like the ‘Social Credit System’ (perhaps we think of Black Mirror: Nosedive), but maybe they make more sense in a shame/honour culture. China may have very different red lines, and different ethical frameworks to govern its approach. It’s arrogant for us to assume ‘West is Best’, so perhaps before we critique others we need to do further work to define our own ethics: where do they come from, where can they flex and where must they not bend, what are our red lines – and why do they matter? Perhaps we need a local ethical conversation before we go global…

Sunak on AI

Next week the UK will host an AI Safety Summit in Bletchley. In preparation Rishi Sunak delivered a speech this week to set the scene.

Here are some highlights:

  • The UK will fund and create an AI Safety Institute, alongside further investment in supercomputing and quantum computing.
  • ‘Capabilities and Risks from Frontier AI’ has been released as a discussion paper.
  • Despite criticism, China has been invited to join the summit – global risks need all the players at the table.

The main appeal of the speech is one of honesty – being upfront about the risks, but also hopeful of the opportunities.

Book Review: Life 3.0

On the recommendation of Elon Musk no less, I picked up a copy of Life 3.0 by Max Tegmark, first published in 2017. Tegmark is a professor at MIT and co-founder of the Future of Life Institute. Whilst this book is primarily about AI, Tegmark is a cosmologist by trade, an interest which is apparent throughout the book.

The book is largely scenario-driven, with lots of “what ifs” – at first sight Tegmark appears speculative at best, living up to his nickname “Mad Max”. But the scenarios are creatively written, leaving us with many more questions while opening up the world of Artificial Intelligence. Tegmark vividly paints a best-case utopian vision, whilst being clear in his appeal to heed the very real risks.

Tegmark is kind to the non-technical reader, offering some simple (as far as possible) explanations of the foundations of this discussion: how machines, and particularly computation, work; memory; and a quick introduction to machine learning, LLMs and neural networks.

Tegmark argues persuasively for a longtermism approach and the moral imperatives to get AI right. I have to confess rather skipping chapter 6 – “The Next Billion Years and Beyond” – I live in a country where longtermism would be to actually deliver a 20-year infrastructure project, so I’m happy to limit my thinking to decades and centuries for now! But you might still find this chapter interesting if the world of theoretical cosmology floats your boat, as Stuart Russell endorses – “For sheer science fun, it’s hard to beat.”

Chapter 3, ‘The near future: breakthroughs, bugs, laws, weapons and jobs’, will certainly feel the most useful for assessing the current moment. It charts some of the progress in AI over recent decades and where we might feasibly be in the next few years and decades. This section provides much excitement around the potential of AI to do good, and plenty of pause for thought and deeper concern around ethical and existential issues. It perhaps leaves us with more questions than answers – would it be a good thing for AI to replace jobs? Is it too late to stop bad actors developing AI into weapons of war? Though, personally speaking, it’s encouraging that Tegmark lists ‘clergy member’ as a ‘safe job’, unlikely to be replaced by AI any time soon!

The later discussion around goals, adoption, alignment and their ethical implications feels especially relevant to my own research and wider reading. There is, I think, a rightful appeal to ‘goodness’ as a key principle in AI ethics, and Tegmark is right to caution us about the subjectivity of this word, both in past history and in its potential for divergence in the future. He grasps at various principles that have helped humanity across the centuries to define ‘good’ – platonic forms, Golden Rules, and the Ten Commandments. Tegmark distils the ethical wisdom of the ages into four principles: Utilitarianism, Diversity, Autonomy and Legacy.

There’s lots of wisdom here and it’s written non-dogmatically, encouraging the reader to think, explore and ask further questions. Personally I think Tegmark’s ethical approach misses the mark on two counts. Firstly, it doesn’t attempt to engage thoroughly with the wisdom of the ages – e.g. why has the Golden Rule stood the test of time, why does it cross cultures and generations, and how could it be applied to the field of AI? Secondly, I’m not sure that Tegmark has really grasped the first part of the book’s subtitle – ‘Being Human’. Throughout the book he describes humans as evolved animals, a ‘historical accident’, and appeals more to the ways in which machines can be ‘like us’ without ever really defining who we are as humans, or what makes us unique. For someone concerned about the existential threat of AI, it would be more comforting for AI developers to have a more speciesist approach to humanity!

To be fair to Tegmark, he rather concedes that his ethical framework for humanity is incomplete – “What’s ‘meaning’? What’s ‘life’? What’s the ultimate ethical imperative?… This makes it timely to rekindle the classic debates of philosophy and ethics, and adds a new urgency to the conversation!” Perhaps this is the point: for all the novelty, excitement and opportunity that AI might afford, might we need to rekindle some of the classic arts of philosophy, ethics and, dare we say, theology?

His final (main) chapter, on consciousness, is an honest approach to a very difficult topic, which he seeks to explore with us with humility and scientific enquiry. Though he does accuse some others of approaching the question with ‘anthropological bias’. This is again where I feel Tegmark misses the point: when considering questions of humanity, and indeed of existential dilemma for humanity, it’s OK to be a little anthropocentric and to value ‘human exceptionalism’, because I don’t think most people are ready, or happy, to concede that human and artificial ‘consciousness’ are equivalent in value.

Notwithstanding my not insignificant concerns with some of Tegmark’s approach, this book is very readable and thought-provoking. It’s particularly good because overall it sets out a positive vision (without being naive) and is personal in tone – you feel like you’re sitting in the seminar room, chatting it through with Max.

Pause AI

There were some rather large protests taking place in London in recent weeks highlighting the current conflict in Gaza. So you could be forgiven for missing the rather smaller “Pause AI” protest.

What are they protesting about?

They are highlighting the present and future risks of AI; fake media, bias, economic instability, weapons, hacking and an existential risk.

What are they asking for?

They’re asking for a complete pause on training AI systems beyond GPT-4. Not indefinitely, but until such a time as safeguards and global standards can be put in place – before these systems are created. They’ve laid out their proposal here – even a step such as enforcing copyright in training LLMs could help to bring about a natural pause.

Various groups have been commissioning surveys to support their proposals; in the UK it’s reported that 74% of the public want to slow down the progress on AI.

Would a ‘pause’ work, or is the ‘genie out of the bottle’? Is there any hope of getting global agreement on these issues? Will the upcoming Bletchley AI Summit make any progress on these questions?


Deepfakes

This week was the start of the UK Labour Party conference 2023. To coincide, a deepfake audio recording of Labour leader Keir Starmer was released with the purpose of discrediting and embarrassing him. Last year, deepfake videos of Zelensky and Putin were released claiming to carry political messages.

There have also been recent online scams involving deepfakes of Mr. Beast and Martin Lewis (MoneySavingExpert). There is clearly already a market for using deepfake technology to influence politics, carry war propaganda and steal money. It’s not just a danger to those tricked; it likely tarnishes the reputation of the person who has been faked too.

Deepfakes are clearly a dangerous tool and should probably be banned.

Yet, there might be some possible ‘good’ uses for the technology:

  • Parody – we don’t want to lose imitation as a comedy tool
  • Learning – training videos using your face, or your teacher’s face might help you learn (according to Bath Uni research)
  • Bringing history to life – why not use as a tool to recreate historic speeches?

Back in 1999 they used CGI to complete Oliver Reed’s final performance in Gladiator – how much better would AI do that today? Personally I’m a fan of resurrecting the voice of James Alexander Gordon to read the football scores.

These subjects are not easy – clearly the potential for abuse needs to be curbed, and yet it’s a technology with potential for good. Perhaps this is also the way out of inconvenient ‘hot-mic moments’ – just say it was a deepfake!

Human Trust

Here’s a wide-ranging interview with philosopher Daniel C. Dennett. In discussing his memoir, the interview takes a detour into AI. Dennett freely admits he is an alarmist, but worries there are plenty of causes for alarm:

The most pressing problem is not that they’re going to take our jobs, not that they’re going to change warfare, but that they’re going to destroy human trust.

Trust is of course hard to build and easy to break. Do we trust AI developers? Do we trust governments or business in their AI safeguards and ethics? Might our trust be lost when we realise we’re speaking to an agent and not a human?

Dennett develops his argument further in his recent piece for The Atlantic – ‘The Problem with Counterfeit People.’ He not only notes the dangers of uncontrolled AI, but considers how the collapse of trust will damage society.

He proposes outlawing AI impersonation of humans (noting the irony, given the early aims of AI: to beat the ‘Turing Test’). To safeguard this he recommends that AI be ‘watermarked’ as such, so that it is unable to impersonate.

I have lots of sympathy with Dennett’s proposal though I would want to ponder:

  • Are there any ethical uses of AI where it is broadly beneficial for AI to at least mimic and model some humanity, e.g. in healthcare or education where that sense of ‘humanity’ is helpful to the task?
  • Given his worldview, I take it that Dennett approvingly references Dawkins’ ‘The Selfish Gene’ – given that, why should we be overly protective as a species? Why not let AI evolve?

AI and IP

Marks and Clerk, the intellectual property specialists, have just released their 2023 AI Report.

It contains some fascinating stats around patent filing for AI projects around the world:

  • US leads the way alongside Europe, but China is catching up (slowly)
  • South Korea has the most applications per capita
  • MedTech / Life Sciences is the sector with the most applications
  • Computer Vision is a primary area of research

But alongside the stats they begin to wrestle with some deeper questions:

  • Can AI be an inventor?
  • Similarly, can AI-generated programs be copyrighted?
  • If an AI agent is learning from copyrighted material, are its results in breach of copyright?

The ethics around generative AI, particularly in the creation of music and images, are likely to generate legal proceedings around who owns the results, especially if it can be proved where the models have been doing their learning.

For instance, here is a short report about Getty Images suing Stability AI – there will be some fascinating legal cases, and precedents set in law, to resolve some of these questions.

Who’s in control?

Brian J. A. Boyd, writing in The New Atlantis, asks some thoughtful questions about the effect of AI on society, and especially the workplace:

“B.S. Jobs and the Coming Crisis of Meaning”

He writes about Autonomous AI – the kind of systems that take the human operator or decision-maker out of the process, systems that just get on with the task and ultimately might come to replace human-performed jobs.

If jobs are replaced by AI it poses all sorts of questions:

  • Is that a good thing?
  • What does that say about the value of those jobs?
  • Which jobs have already put the machine in charge?
  • What does that all say about the value of the people?

Of course there are big economic and societal considerations that stem from these questions – not least, what work is for…

There are clearly potential benefits, but Boyd’s reasonable fear is that AI becomes the middle class of society: a ruling elite who set the rules, AI the decision-maker, and a new underclass who serve the system. Doesn’t sound like progress or utopia to me!

The creation of perfect AI servants, if embedded in social structures with roles designed to maximize profit or sustain oligarchy, may bring about not a broad social empowerment but a “servile state,” formalizing the subjugation of an underclass to those who control the means of production.

Red Lines

In their ‘AI Pause’ discussion, Tegmark, Russell and Dempsey are all clear on the need for red lines in AI. It might sound prudish, but perhaps one of the first red lines should be the creation of AI-generated nude images.

I dare say you might think differently on this topic if you were the parent of a teenage girl in the small Spanish town of Almendralejo. Over the summer, 20 girls aged 11–17 had their photos taken from social media sites and used to create AI-generated naked images.

The suspects? A group of local boys aged 12-14.

Now, how do we feel about some red lines?

AI-generated naked child images shock Spanish town of Almendralejo (BBC)