Anonymization of audio

vincent · 14 May 2024 08:05

Is anyone aware of tools that allow anonymization in the audio of mp4 files – i.e. you provide a list of names and an mp4 file and you get back the mp4 file with the names beeped out (or silenced or something)?

florian.schiel · 28 June 2024 14:33

Hi Vincent,

in our BAS WebServices suite we have a tool ‘Anonymizer’ that reads a media file, a BPF annotation file and a list of words to be anonymized, and then produces a ZIP containing both the anonymized annotation file and the anonymized media file. You can choose between a beep or brown noise, and set the word/phoneme label that will be used in the annotation (e.g. ‘ANONYMIZED’).
I’m not sure if this is what you need; it sounds as if you don’t have any transcript/annotation. If that is the case you can use the tool in a pipeline together with an ASR system of your choice and then the MAUS aligner to first compute the word locations and then anonymize them.
The URL for all these services is hdl.handle.net/11858/00-1779-0000-000C-DA82-F.
There is a short paper about this from Interspeech in Graz with my name on it (I think it was a demo or hands-on session).
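To make the last step of such a pipeline concrete, here is a minimal Python sketch of the general idea: once an aligner has given you word labels with start and end times, the samples of the matching words are overwritten with a beep and the labels are replaced. This is not the BAS Anonymizer's implementation. The function name `beep_out`, its parameters, and the `(label, start_sec, end_sec)` tuple format are assumptions made up for this illustration, and it does not read BPF files or mp4 containers.

```python
import math


def beep_out(samples, sample_rate, words, names, freq=1000.0, amp=0.3,
             replacement="ANONYMIZED"):
    """Replace every word whose label is in `names` by a sine beep.

    samples     -- list of float samples (mono), hypothetical input format
    sample_rate -- samples per second
    words       -- list of (label, start_sec, end_sec) from some aligner
    names       -- words to anonymize (matched case-insensitively)
    Returns a new sample list and the list of (possibly replaced) labels.
    """
    targets = {n.lower() for n in names}
    out = list(samples)  # work on a copy, leave the input untouched
    labels = []
    for label, start, end in words:
        if label.lower() in targets:
            # Convert the time interval to sample indices, clipped to the signal.
            first = max(0, int(round(start * sample_rate)))
            last = min(len(out), int(round(end * sample_rate)))
            for i in range(first, last):
                out[i] = amp * math.sin(2 * math.pi * freq * i / sample_rate)
            labels.append(replacement)
        else:
            labels.append(label)
    return out, labels


# Toy example: one second of constant signal, the middle word is a name.
signal = [0.5] * 8000
alignment = [("hello", 0.0, 0.25), ("Vincent", 0.25, 0.5), ("there", 0.5, 1.0)]
anonymized, new_labels = beep_out(signal, 8000, alignment, ["vincent"])
print(new_labels)
```

Extracting the audio track from the mp4 and writing it back is left out here; that would need a separate tool, and the word times would come from the ASR and alignment steps.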
Best,
Florian

vincent · 2 July 2024 07:41

Thank you Florian, this is more or less what I needed, so I’ll surely check it out! Does it work for any language? (What is the assumed g2p language?)

florian.schiel · 2 July 2024 09:29

Dear Vincent,
well, it depends on what you mean by ‘it’; if you want to run a fully automatic pipeline with speech recognition and anonymisation, then of course you need a language that is covered by the speech recognition system.
It usually makes sense to run the automatic speech recognition first, then inspect the transcript to find the exact terms to be anonymized, and only then run the anonymisation.

Best,

Flo
