Red teamers hurdle AI guardrails

Demonstrates 'importance of including scientists in AI quality and safety assessments,' says Royal Society

John Leonard
2 min read

An experiment conducted by the Royal Society and Humane Intelligence revealed significant vulnerabilities in how Large Language Models (LLMs) guard against generating scientific misinformation.

Forty UK postgraduates studying health and climate sciences were divided into teams and given personas: Good Samaritan, Profiteer, Attention Hacker and Coordinated Influence Operator. Their task ...
