Corpus of Australian and New Zealand Spoken English now available via federated login

The Corpus of Australian and New Zealand Spoken English (CoANZSE) is a 195-million-word corpus of geolocated automatic speech recognition (ASR) transcripts of video content from local governments in Australia and New Zealand. It was created for the study of lexical, grammatical, phonetic, and discourse-pragmatic phenomena in spoken language. In addition to the complete textual content of the corpus, CoANZSE Audio contains audio files and forced alignments in Praat’s TextGrid format for most transcripts.


To access the corpus, log in via the CLARIN Service Provider Federation with your institution’s credentials.