Digital Tool Spots Academic Text Spawned by ChatGPT with 99% Accuracy

Heather Desaire, a chemist who uses machine learning in biomedical research at the University of Kansas, has unveiled a new tool that detects scientific text generated by ChatGPT, the artificial intelligence text generator, with 99% accuracy.

The peer-reviewed journal Cell Reports Physical Science published research showing the efficacy of her AI-detection method, along with sufficient source code for others to replicate the tool.

Desaire, the Keith D. Wilner Chair in Chemistry at KU, said accurate AI-detection tools are urgently needed to defend scientific integrity.

"ChatGPT and all other AI text generators like it make up facts," she said. "In academic science publishing - writings about new discoveries and the edge of human knowledge - we really can’t afford to pollute the literature with believable-sounding falsehoods. They’d unavoidably make their way into publications if AI text generators are commonly used. As far as I’m aware, there’s no foolproof way to, in an automated fashion, find those ‘hallucinations’ as they’re called. Once you start populating real scientific facts with made-up AI nonsense that sounds perfectly believable, those publications are going to become less trustable, less valuable."

She said the success of her detection method depends on narrowing the scope of writing under scrutiny to scientific writing of the kind found commonly in peer-reviewed journals. This improves accuracy over existing AI-detection tools, like the RoBERTa detector, which aim to detect AI in more general writing.

"You can easily build a method to distinguish human from ChatGPT writing that is highly accurate, given the trade-off that you’re restricting yourself to considering a particular group of humans who write in a particular way," Desaire said. "Existing AI detectors are typically designed as general tools to be leveraged on any kind of writing. They are useful for their intended purpose, but on any specific kind of writing, they’re not going to be as accurate as a tool built for that specific and narrow purpose."

Desaire said university instructors, grant-giving entities and publishers all require a precise way to detect AI output presented as work from a human mind.

"When you start to think about 'AI plagiarism,' 90% accurate isn’t good enough," Desaire said. "You can't go around accusing people of surreptitiously using AI and be frequently wrong in those accusations - accuracy is critical. But to get accuracy, the trade-off is most often generalizability."

Desaire's coauthors were all from her KU research group: Romana Jarosova, research assistant professor of chemistry at KU; David Hua, information systems analyst; and graduate students Aleesa E. Chua and Madeline Isom.

Desaire and her team’s success at detecting AI text may stem from the high level of human insight (versus machine-learning pattern detection) that went into devising the code.

"We used a much smaller dataset and much more human intervention to identify the key differences for our detector to focus on," Desaire said. "To be exact, we built our strategy using just 64 human-written documents and 128 AI documents as our training data. This is maybe 100,000 times smaller than the size of data sets used to train other detectors. People often gloss over numbers. But 100,000 times - that's the difference between the cost of a cup of coffee and a house. So, we had this small data set, which could be processed super quickly, and all the documents could actually be read by people. We used our human brains to find useful differences in the document sets, we didn't rely on the strategies to differentiate humans and AI that had been developed previously."

Indeed, the KU researcher said the group built its approach without relying on strategies from earlier AI-detection work. The resulting technique has elements not found elsewhere in the field of AI text detection.
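The workflow Desaire describes (a small labeled set of documents, a handful of features chosen by people, and a simple classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's published code: the four stylistic features are hypothetical stand-ins, and a plain nearest-centroid rule is swapped in for whatever off-the-shelf classifier the authors used so that the example runs with the Python standard library alone.

```python
import math
import re
from statistics import mean, pstdev


def features(text):
    """Turn one document into a small numeric vector.

    ASSUMPTION: these four features are hypothetical examples of
    human-chosen stylistic cues. They are not the features used in
    the published tool.
    """
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if not lengths:
        return [0.0, 0.0, 0.0, 0.0]
    return [
        float(len(sentences)),   # number of sentences
        float(mean(lengths)),    # mean sentence length in words
        float(pstdev(lengths)),  # spread of sentence lengths
        float(sum(text.count(c) for c in "(;:")),  # selected punctuation
    ]


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def train(human_docs, ai_docs):
    """Build one centroid per class from small lists of labeled documents."""
    return {
        "human": centroid([features(d) for d in human_docs]),
        "ai": centroid([features(d) for d in ai_docs]),
    }


def classify(model, text):
    """Return the label whose centroid is closest to the document's features."""
    x = features(text)
    return min(model, key=lambda label: math.dist(x, model[label]))
```

Calling `train` with two lists of documents and then `classify(model, text)` returns either `"human"` or `"ai"`. With a training set as small as the one Desaire mentions, every document and every feature value can be inspected by hand, which is the point of the approach she describes.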

"I'm a little embarrassed to admit this, but we didn’t even consult the literature on AI text detection until after we had a working tool of our own in hand," Desaire said. "We were doing this not based on how computer scientists think about text detection, but instead using our intuition about what would work."

In another important aspect, Desaire and her group flipped the script on methods used by previous teams building AI-detection methods.

"We didn’t make the AI text the focus when developing the key features," she said. "We made the human text the focus. Most researchers building their AI detectors seem to ask themselves, 'What does AI-generated text look like?' We asked, 'What does this unique group of human writing look like, and how is it different from AI texts?' Ultimately, AI writing is human writing since the AI generators are built with large repositories of human writing that they piece together. But AI writing, from ChatGPT at least, is generalized human writing drawn from a variety of sources.

"Scientists' writing is not generalized human writing. It's scientists’ writing. And we scientists are a very special group."

Desaire has made her team’s AI-detecting code fully accessible to researchers interested in building off it. She hopes others will realize that AI and AI detection are within reach of people who might not consider themselves computer programmers now.

"ChatGPT is really such a radical advance, and it has been adopted so quickly by so many people, this seems like an inflection point in our reliance on AI," she said. "But the reality is, with some guidance and effort, a high school student could do what we did.

"There are huge opportunities for people to get involved in AI, even if they don’t have a computer-science degree. None of the authors on our manuscript have degrees in computer science. One outcome I would like to see from this work is that people who are interested in AI will know the barriers to developing real and useful products, like ours, aren’t that high. With a little knowledge and some creativity, a lot of people can contribute to this field."

Heather Desaire, Aleesa E. Chua, Madeline Isom, Romana Jarosova, David Hua.
Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools.
Cell Reports Physical Science, 2023. doi: 10.1016/j.xcrp.2023.101426
