Hi, I'm Paul, a researcher working on AI safety.
Recently, I built an AI agent that can autonomously build AI control evaluation environments, which means we can now scale the production of this critical safety infrastructure. Over the last year, I manually built evaluations for organizations such as the UK AI Security Institute and Anthropic.
Before this, I worked on risk modelling for the UK AI Security Institute, independent research on goal detection via interpretability, and other AI safety-related projects. See this page for some old blog posts on this work.
Prior to working on AI safety, I did a PhD in mathematics.
Aside from the above, I'm interested in Tibetan Buddhism, mathematics, the technological singularity, consciousness, understanding utopias, and other such topics.
For fun, I enjoy learning, exploring cities & nature, dancing, spending time with friends & family, watching films, meditation & relaxation, and (contemporary) art.
Contact: firstname.lastname@gmail.com