ChatGPT and Other AI-generated Text Detectors

Educators often check whether text is AI-generated in order to detect the use of large language models and generative AI. Here are some of the best tools for detecting AI-generated text and how they work.


How do AI Detectors work?

An AI detector looks for patterns in a text that are indicative of AI generation. For example, a chatbot may use a certain word or phrase more often than a human would, or it may be more reluctant to use first-person pronouns such as "I" or "we". It may make fewer typos and be more repetitive. Just as each person has their own writing style, each chatbot has its own characteristic patterns that an AI detector can pick up.
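As a rough illustration of the kinds of signals described above, the following Python sketch computes a few simple style statistics for a passage. The function name `style_features`, the pronoun list, and the choice of statistics are assumptions made for this example. Real detectors rely on trained models rather than a handful of hand-picked counts.

```python
import re
from collections import Counter

# Hypothetical pronoun list chosen for this example only.
FIRST_PERSON = {"i", "we", "me", "us", "my", "our"}


def style_features(text: str) -> dict:
    """Compute a few toy style signals for a passage of text."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {
            "first_person_rate": 0.0,
            "type_token_ratio": 0.0,
            "top_word_share": 0.0,
        }
    counts = Counter(words)
    return {
        # Share of words that are first-person pronouns.
        "first_person_rate": sum(counts[w] for w in FIRST_PERSON) / len(words),
        # Distinct words divided by total words; lower means more repetition.
        "type_token_ratio": len(counts) / len(words),
        # Share of the text taken up by the single most frequent word.
        "top_word_share": counts.most_common(1)[0][1] / len(words),
    }


if __name__ == "__main__":
    print(style_features("I think we should go. We like it."))
```

None of these numbers is evidence of AI authorship on its own. They only show how "patterns in the text" can be turned into measurable features that a detector might compare across many samples.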

What are strategies for avoiding detection by AI Detectors?

Though no strategy is foolproof, people have found several ways to avoid detection by AI checkers:

  • Ask the chatbot to give you the text in a specific style, e.g. "in a clear, concise manner with a passionate and enthusiastic tone". This will make the generated text more unique and less discernibly AI-generated.
  • Introduce small typos and grammatical errors into the text.
  • Proofread the text and change phrases and sentences that sound too repetitive or unnatural.
  • Ask different chatbots to generate different sections of the text, then combine them together. Or, ask different chatbots to paraphrase different sections of the text.
  • Provide the chatbot with sources that are relevant to the topic you want it to write about. This will make the facts the AI generates more accurate and less likely to be detected as AI-generated.

What is the academic literature on generative AI text detection?

Researchers have found that it's impossible to detect and watermark AI-generated text with 100% accuracy [1]. This is because as AI-generated text improves, it becomes indistinguishable from human-written text. However, we aren't quite there yet, so there are still some good algorithms for detecting AI-generated text.

  • Computer scientists at the University of Washington and the Allen Institute found that they could detect fake news with 92% accuracy when they had access to the model used to generate it, and that a generally available model could detect it with 73% accuracy [2].

  • Visualizations developed at MIT and Harvard help humans build intuition for what is AI-generated and what is not, improving human accuracy at detecting AI-generated text to 72% [3].

  • Stanford researchers invented a tool called DetectGPT that can detect AI-generated text with extremely high accuracy, but they acknowledge that detection is much harder when you don't know which model produced the text. In this case, the tool is only for GPT-generated text, GPT being the common model family used by OpenAI and others [4].
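The DetectGPT paper's title refers to "probability curvature": the idea is that machine-generated text tends to score higher under the generating model than slightly reworded versions of the same text do. The Python sketch below illustrates that comparison under heavy simplification. The names `perturbation_discrepancy`, `toy_log_prob`, and `toy_perturb` and the tiny word-probability table are all hypothetical stand-ins invented for this example. The actual method scores text with a real language model and rewrites passages with a separate mask-filling model, neither of which is reproduced here.

```python
import math
import random
from typing import Callable


def perturbation_discrepancy(
    text: str,
    log_prob: Callable[[str], float],
    perturb: Callable[[str, random.Random], str],
    n_perturbations: int = 20,
    seed: int = 0,
) -> float:
    """Log-probability of the text minus the mean log-probability of perturbed copies.

    A clearly positive value means the original scores higher than its
    rewordings, which is the signal this kind of detector looks for.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same score
    perturbed = [log_prob(perturb(text, rng)) for _ in range(n_perturbations)]
    return log_prob(text) - sum(perturbed) / len(perturbed)


# Toy stand-ins (assumptions for illustration, not a real language model).
TOY_WORD_PROBS = {"the": 0.4, "cat": 0.2, "sat": 0.2, "mat": 0.1, "zebra": 0.01}


def toy_log_prob(text: str) -> float:
    """Sum of per-word log-probabilities under a made-up unigram table."""
    return sum(math.log(TOY_WORD_PROBS.get(w, 0.001)) for w in text.split())


def toy_perturb(text: str, rng: random.Random) -> str:
    """Replace one randomly chosen word with a random word from the table."""
    words = text.split()
    if words:
        words[rng.randrange(len(words))] = rng.choice(sorted(TOY_WORD_PROBS))
    return " ".join(words)


if __name__ == "__main__":
    for sample in ("the the the", "zebra zebra zebra"):
        score = perturbation_discrepancy(sample, toy_log_prob, toy_perturb)
        print(sample, round(score, 3))
```

In this toy setup, a passage made of the table's most probable words gets a higher score than one made of its least probable words. Because the score depends on which model supplies `log_prob`, the sketch also hints at why detection gets harder when the generating model is unknown.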

What are the best freely available tools for detecting AI-generated text?

Scribbr: Free AI Content Detector

Designed to detect text from ChatGPT 3.5, GPT-4, and Google Bard, Scribbr is a free tool that can detect AI-generated text quickly and easily with a simple copy and paste. There is a maximum of 500 words per check, but you can check as many passages as you want.

https://www.scribbr.com/ai-detector/
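Because of the 500-word limit per check, a longer document has to be pasted in pieces. The following Python sketch splits text into passages of at most 500 words. The function name `split_into_passages` is hypothetical, and counting words by splitting on whitespace is an assumption that may not match exactly how the tool counts them.

```python
from typing import List


def split_into_passages(text: str, max_words: int = 500) -> List[str]:
    """Split text into consecutive passages of at most max_words words each."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    # Whitespace splitting is an approximation of the tool's word count.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = " ".join(["word"] * 1200)
    print([len(p.split()) for p in split_into_passages(sample)])
```

Splitting on word count alone can cut a sentence in half, so in practice it may be better to adjust the boundaries by hand to fall between paragraphs.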

ZeroGPT: Advanced and Reliable Chat GPT, GPT4 & AI Content Detector

ZeroGPT can detect AI-generated text from either a copy-paste or a file upload. It highlights the text it suspects was most likely written by AI. The free version allows a maximum of 15,000 characters per check, and the paid version supports up to 100,000 characters.

https://www.zerogpt.com/

GPTZero

GPTZero is a free tool with a clean interface that can detect AI-generated text from a copy-paste or a file upload of .pdf, .docx, or .txt files. It has a maximum of 5,000 characters per check. GPTZero gives nice visualizations of the text, highlighting which text it suspects is most likely written by AI.

https://gptzero.me/

References

  1. Zhang, H., Edelman, B. L., Francati, D., Venturi, D., Ateniese, G., & Barak, B. (2023). Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2311.04378

  2. Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F., & Choi, Y. (2019). Defending Against Neural Fake News (Version 3). arXiv. https://doi.org/10.48550/ARXIV.1905.12616

  3. Gehrmann, S., Strobelt, H., & Rush, A. M. (2019). GLTR: Statistical Detection and Visualization of Generated Text (Version 1). arXiv. https://doi.org/10.48550/ARXIV.1906.04043

  4. Mitchell, E., Lee, Y., Khazatsky, A., Manning, C. D., & Finn, C. (2023). DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2301.11305