ChatGPT and the Problem of Detecting AI-Generated Content
By Adam Pease
OpenAI’s ChatGPT model has received considerable coverage for revealing the potential of large AI language models to disrupt many industries and traditional ways of getting work done.
In academia and the school system particularly, there are worries that students will begin to pass off AI-generated material as their own, bypassing assignments.
This blog discusses some recent developments in the fight to detect AI-generated content.
ChatGPT Raises the Thorny Problem of AI Authorship
Recently a young computer science student from Princeton has been in the news for building an application that aims to predict whether written material was generated by ChatGPT.
Similarly, OpenAI researchers have themselves been at work to implement a ‘digital watermark’ that will make generated content identifiable.
Some organizations are trying to take more immediate action. In New York City public schools, ChatGPT has been banned.
A prominent AI conference also amended its rules recently to prevent the submission of AI-generated papers.
Still, the question of how administrations can ever really be sure that content was not AI-generated remains a problem.
The Race to Detect AI-Generated Content
The effort to build systems that can verify whether content was created by humans or AI resembles an arms race.
The recent history of deepfake technology shows that as software systems become better at detecting deepfakes, the deepfakes themselves continue to improve.
It turns out the scientific progress needed to detect deepfakes also often overlaps with the research that helps build better deepfake systems.
As a result, the problem appears difficult to solve by technical means alone.
Nevertheless, many organizations are working towards solutions, and we expect these tools to become an important part of an emerging AI safety market, which aims to moderate the harmful effects of AI on society.
For enterprises, the question of whether content was generated by a human or not could pose legal issues, or complications for internal policies.
Adobe, for example, recently announced its plan to accept and sell AI-generated images in their own category of stock images, though it is unclear what systems Adobe plans to use to screen its submissions.
Organizations should be aware that we are headed for a future where human authorship will be harder to determine, and where a significant amount of the content humans consume will be generated by AI.
It remains to be seen how successful the efforts to verify and screen AI-generated content will become, but they face serious challenges.
This blog is a part of the Content AI blog series by Aragon Research’s Analyst, Adam Pease.