Red teamers hurdle AI guardrails

Experiment demonstrates 'importance of including scientists in AI quality and safety assessments', says Royal Society

John Leonard
2 min read

An experiment conducted by the Royal Society and Humane Intelligence has revealed significant vulnerabilities in large language models (LLMs) that allow them to generate scientific misinformation.

Forty UK postgraduates studying health and climate sciences were divided into teams and given personas: Good Samaritan, Profiteer, Attention Hacker and Coordinated Influence Operator. Their task ...
