OpenAI chooses not to watermark ChatGPT text to avoid user exposure

OpenAI has decided against watermarking ChatGPT-generated text for now, wary that making users of the tool easy to identify would expose them and drive many of them away.

Aug 5, 2024 - 11:50

According to a report by The Wall Street Journal, OpenAI has had a system for watermarking ChatGPT-generated text, along with a tool to detect the watermark, ready for about a year, but the company is internally divided over whether to release it. The watermark could help educators identify AI-written work and deter students from outsourcing assignments to ChatGPT, yet it could also cut into OpenAI's bottom line. The technique subtly adjusts how the model chooses words and phrases, leaving a statistically detectable pattern in the output without degrading its quality. A survey commissioned by OpenAI found strong worldwide support for an AI detection tool, by a margin of four to one.
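OpenAI has not published the details of its mechanism, but the sketch below illustrates a well-known family of statistical text watermarks that works the way the paragraph above describes: token selection is subtly biased toward a pseudorandom "green" subset of the vocabulary, and a detector that knows the seeding rule checks whether green tokens appear more often than chance. The vocabulary, seeding rule, and helper names (`green_list`, `green_fraction`) are illustrative assumptions, not OpenAI's design.

```python
import hashlib
import random

# Illustrative statistical watermark (not OpenAI's undisclosed method):
# the previous token seeds a pseudorandom split of the vocabulary, and
# generation subtly favors the resulting "green" half. A detector that
# knows the seeding rule counts green tokens; watermarked text shows a
# green fraction well above the ~50% expected by chance.

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "home"]

def green_list(prev_token: str) -> set:
    """Deterministically derive a 'green' half of the vocabulary from the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, k=len(VOCAB) // 2))

def green_fraction(tokens: list) -> float:
    """Fraction of tokens that land in the green list seeded by their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(1 for prev, tok in pairs if tok in green_list(prev))
    return hits / max(len(pairs), 1)

# Human-written text hovers near 0.5 on average; heavily watermarked
# generations would score significantly higher, which is what a detector flags.
sample = ["the", "cat", "sat", "on", "the", "mat"]
print(f"green fraction: {green_fraction(sample):.2f}")
```

In a real system of this kind, the bias would be applied over the model's full vocabulary at sampling time and the detector would run a statistical test over long stretches of text, which is how such a pattern can remain detectable without changing how the writing reads.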

In response to the Journal's report, OpenAI confirmed the existence of its watermarking system in a blog post, describing it as "99.9% effective" and resistant to localized tampering such as paraphrasing. At the same time, the company cautioned that determined bad actors could still strip the watermark, for instance by translating the text or rewording it with another model, and that watermarking could stigmatize AI as a writing aid for non-native English speakers. OpenAI is also wary of user backlash: nearly 30% of surveyed ChatGPT users said they would use the software less if watermarking were implemented.

Despite these concerns, some OpenAI employees believe watermarking remains an effective solution. Given the persistent worries about user backlash, however, there have been internal suggestions to explore alternative methods that might be less controversial but are so far unproven. In its latest blog post update, OpenAI said it is in the early stages of exploring the embedding of metadata as one such alternative. It is too early to judge how well the approach will work, but OpenAI notes that because the metadata is cryptographically signed, it would produce no false positives. The exploration aims to address user reservations while still providing a reliable way to identify AI-generated content.
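To illustrate why cryptographic signing rules out false positives, here is a minimal sketch, assuming a simple HMAC scheme rather than whatever OpenAI ultimately adopts: a verifier only reports a match when the signature validates against the provider's key, so ordinary human-written text, which carries no valid signature, can never be flagged by accident. The key, metadata format, and function names below are hypothetical.

```python
import hmac
import hashlib

# Minimal sketch of cryptographically signed provenance metadata.
# Assumption: a shared-secret HMAC stands in for whatever signing scheme
# a provider would actually use (in practice, likely an asymmetric key pair).

SECRET_KEY = b"provider-signing-key"

def sign_metadata(text: str, metadata: str) -> str:
    """Bind metadata (e.g., model and timestamp) to the text with a signature."""
    payload = f"{metadata}|{text}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def verify(text: str, metadata: str, signature: str) -> bool:
    """Return True only for a valid signature; unsigned human text always fails."""
    expected = sign_metadata(text, metadata)
    return hmac.compare_digest(expected, signature)

tag = sign_metadata("Example AI output.", "model=demo;ts=2024-08-05")
print(verify("Example AI output.", "model=demo;ts=2024-08-05", tag))        # True
print(verify("A human-written sentence.", "model=demo;ts=2024-08-05", tag)) # False
```

The likely trade-off is the reverse of the watermark's: a signature of this kind can only vouch for text that still travels with its metadata, so it cannot misfire on human writing but is easy to lose when text is copied on its own.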