Study identifies the best generative AI text detectors

Tools are improving, but are not yet reliable enough to enforce AI labelling

A new study has released benchmark results evaluating machine-generated text detectors.

The May 2024 study, RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors, comes from a collaboration between the University of Pennsylvania, University College London, King's College London, and Carnegie Mellon University.

The study's authors say they have presented "the largest and most challenging benchmark dataset for machine-generated text detection."

The researchers evaluated 12 leading AI detectors across three categories: neural classifiers, metric-based methods, and commercial tools.

The study found that Binoculars performed "impressively well across models even at extremely low false positive rates," Originality "achieved high precision in some constrained scenarios," and GPTZero was "unusually robust to adversarial attacks."

Testing was performed against 11 advanced text generation models, including ChatGPT. The research spanned eight diverse domains of text, challenging detectors with 11 types of sophisticated adversarial attacks. The dataset encompassed 6,287,820 text records.
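These attacks make small, surface-level edits that leave a passage readable to humans while changing how a detector tokenises it. As a minimal illustration, and not the benchmark's own code, the Python sketch below implements a homoglyph-style substitution, one simple class of black-box attack in which Latin letters are swapped for visually identical Unicode characters; the mapping used here is a hypothetical example.

```python
# Minimal sketch of one adversarial attack style: replacing Latin
# characters with visually identical Unicode homoglyphs. The mapping
# below is illustrative, not the exact table used in any benchmark.

HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a
    "e": "\u0435",  # Cyrillic small ie
    "o": "\u043e",  # Cyrillic small o
    "c": "\u0441",  # Cyrillic small es
    "p": "\u0440",  # Cyrillic small er
}

def homoglyph_attack(text: str) -> str:
    """Swap selected ASCII letters for lookalike Unicode characters.

    The text looks unchanged to a human reader, but its underlying
    character (and token) sequence changes, which can push a
    detector's score below its decision threshold.
    """
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

original = "The quick brown fox jumps over the lazy dog."
attacked = homoglyph_attack(original)
print(attacked)              # renders almost identically on screen
print(original == attacked)  # False: the underlying characters differ
```

Because the attacked string renders the same but encodes differently, a detector that relies on token statistics can be fooled without any visible change to the text.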

The methodology applied a 5% false positive threshold across all tests "to ensure a balanced measure of precision and recall, critical for real-world applicability."
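In practice, evaluating at a fixed false positive rate means calibrating each detector's decision threshold on human-written text first, then measuring how much machine-generated text it still flags. The Python sketch below illustrates that standard procedure; the detector scores are synthetic numbers invented for the example and are not drawn from the study.

```python
import numpy as np

# Hypothetical detector scores: higher = more likely machine-generated.
rng = np.random.default_rng(0)
human_scores = rng.normal(0.3, 0.10, 10_000)    # scores on human-written text
machine_scores = rng.normal(0.7, 0.15, 10_000)  # scores on generated text

# Choose the threshold so that only 5% of human texts are flagged,
# i.e. the 95th percentile of the human score distribution.
threshold = np.quantile(human_scores, 0.95)

false_positive_rate = (human_scores >= threshold).mean()  # ~0.05 by construction
detection_rate = (machine_scores >= threshold).mean()     # recall at 5% FPR

print(f"threshold={threshold:.3f}  "
      f"FPR={false_positive_rate:.3f}  "
      f"recall={detection_rate:.3f}")
```

Holding the false positive rate fixed makes detectors comparable: each is only credited for catching machine-generated text at an operating point where it rarely misfires on human writing.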

"As the generation capabilities of language models have continued to increase, accurately and automatically detecting machine-generated text has become an important priority. Detection efforts have even surpassed the bounds of natural language processing research, spurring discussions by social media companies and governments on possibly mandating labels for machine-generated content," the researchers said.

However, the researchers went on to say that enforcing such labelling with these detection tools would be difficult.

"Despite the protective intentions of these mandates, our work shows that such regulations would be difficult to enforce even if they were implemented. Detectors are not yet robust enough for widespread deployment or high-stakes use: many detectors we tested are nearly inoperable at low false positive rates, fail to generalise to alternative decoding strategies or repetition penalties, show clear bias towards certain models and domains, and quickly degrade with simple black-box adversarial attacks."

Originality.AI issued a news release on the study's findings about its AI-detection technology, saying it had "achieved an outstanding 85% accuracy, outperforming the nearest competitor, which scored 80%," was rated a "top performer on adversarial datasets," "led the field in five out of eight content domains," and that its "detection capabilities were particularly notable in identifying paraphrased content, achieving a remarkable 96.7% accuracy compared to an average of 59% among other detectors."

"The RAID study represents a significant milestone in the field of AI detection. It provides a robust framework for evaluating the efficacy of AI detectors, ensuring that they meet the highest standards of accuracy and reliability," Originality.AI said.

AI-generated text and images have become a concern, especially in sectors such as academia. One study found that AI tools such as ChatGPT were likely used to assist with a significant number of research papers published in 2023, mostly without proper disclosure.

Another hot-button issue surrounds the use of AI-generated content in political ads, particularly deepfakes: images and video that convincingly mimic a politician or other public figure. Several US states, including Arizona, California, Hawaii and Massachusetts, have proposed or enacted legislation on deepfakes.

However, AI detectors are not without controversy. According to a blog post on Binoculars' site, "the use of LLM detectors such as Binoculars also raises important ethical issues … While they can help protect against disinformation and preserve the authenticity of information, there is a risk that they could be misused or have unintended negative effects. For example, texts written by non-native speakers could be misclassified as machine-generated."

A 2023 study also found that AI detectors are biased against non-native English writers.

This article was first published on MES Computing.