Limitations of watermarking AI-generated speech using AudioSeal

Files
2025384933.pdf (318.13 KB)
Accepted Version
Date
2025
Authors
Faziludeen, Shameer
Sankar, Arun
De Leon, Phillip L.
Roedig, Utz
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Abstract
AI-generated speech is now of such high quality that it is indistinguishable from that of a genuine human speaker. Neither expert listeners nor purpose-built detectors can reliably tell the two apart. It has therefore been proposed that AI systems which generate speech embed a secondary signal, or watermark, that allows the speech to be identified as AI-generated. AudioSeal is currently the most advanced watermarking algorithm proposed for this purpose, and its resilience against common channel and coding effects has been demonstrated. In this paper, we present approaches that compromise AudioSeal, making it unusable in practical settings. First, we describe two methods that shift the detector score distribution for watermarked speech toward the distribution for unwatermarked speech. Second, we describe a method that applies AudioSeal watermarks generated for one speaker’s signal to a different speaker’s signal, i.e., unmatched watermarks. These unmatched watermarks, which could be imposed on genuine human speech, are also inaudible and resilient, and they shift the detector score distribution away from that of unwatermarked speech. Considering both approaches, we observe that AudioSeal watermarks cannot be used to reliably distinguish AI-generated speech from genuine human speech because the score distributions overlap. While our results are specific to AudioSeal, they cast doubt on watermarking in general as an approach to identifying AI-generated speech.
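The effect of overlapping score distributions on a threshold-based decision can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `error_rates`, the score range of 0 to 1, the threshold of 0.5, and all score values are hypothetical and are not taken from the paper or from AudioSeal itself.

```python
from typing import Sequence, Tuple


def error_rates(
    watermarked: Sequence[float],
    unwatermarked: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (miss_rate, false_alarm_rate) for a score-threshold detector.

    A clip is flagged as watermarked when its score is >= threshold.
    """
    misses = sum(1 for s in watermarked if s < threshold)
    false_alarms = sum(1 for s in unwatermarked if s >= threshold)
    return misses / len(watermarked), false_alarms / len(unwatermarked)


# Hypothetical detector scores in [0, 1]; illustrative values only,
# not measurements from the paper.
clean_watermarked = [0.95, 0.97, 0.99, 0.92]
clean_unwatermarked = [0.05, 0.10, 0.02, 0.08]

# Assumed effect of a removal-style attack: watermarked scores move
# toward the unwatermarked range.
attacked_watermarked = [0.40, 0.15, 0.60, 0.30]
# Assumed effect of an unmatched watermark imposed on human speech:
# unwatermarked scores move upward.
forged_unwatermarked = [0.70, 0.45, 0.85, 0.20]

THRESHOLD = 0.5  # assumed decision threshold

print(error_rates(clean_watermarked, clean_unwatermarked, THRESHOLD))
print(error_rates(attacked_watermarked, forged_unwatermarked, THRESHOLD))
```

With well-separated scores, a single threshold makes no errors on this toy data. Once both sets of scores move into the same range, no threshold separates them cleanly, which is the sense in which overlapping distributions prevent reliable identification.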
Keywords
Watermarking, AI generated speech, Deep fake, AudioSeal, Audio watermarking, Watermark attacks
Citation
Faziludeen, S., Sankar, A., De Leon, P. L. and Roedig, U. (2025) 'Limitations of watermarking AI-generated speech using AudioSeal', 7th IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (TPS 2025), Pittsburgh, PA, USA, 11-14 November 2025.