List of Datasets for Automatic Speech Recognition (ASR) and Text To Speech Synthesis (TTS)

Published on: Dec 20, 2021 Latest update: Dec 20, 2021

This list contains datasets aimed at both ASR (sometimes called STT, speech-to-text) and TTS. Rule of thumb: ASR and TTS datasets are interchangeable if used carefully, since both tasks rely on audio recordings paired with text.
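As a minimal sketch of that rule of thumb, the same (audio, transcript) pair can be read in either direction: audio in and text out for ASR, text in and audio out for TTS. The names `Utterance`, `as_asr_example`, `as_tts_example` and the file path below are hypothetical and only for illustration; they are not part of any dataset listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One dataset record: a recording and the text spoken in it (hypothetical layout)."""
    audio_path: str
    transcript: str


def as_asr_example(u: Utterance) -> dict:
    # ASR: the model receives audio and must predict the text.
    return {"input": u.audio_path, "target": u.transcript}


def as_tts_example(u: Utterance) -> dict:
    # TTS: the model receives text and must produce the audio.
    return {"input": u.transcript, "target": u.audio_path}


# Hypothetical record; the path is a placeholder, not a real file.
sample = Utterance(audio_path="clips/example_0001.wav", transcript="seven")
asr = as_asr_example(sample)
tts = as_tts_example(sample)
print(asr)
print(tts)
```

The "carefully" part is not captured by this sketch: TTS training usually benefits from clean recordings of one or a few speakers, while ASR training usually benefits from many speakers and varied recording conditions, so a dataset built for one task may need filtering before it is used for the other.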

AudioMNIST
- spoken digits (0 - 9) by 60 different speakers

Common Voice
- provides samples in various languages

FSDD (Free Spoken Digit Dataset)
- spoken digits by 6 speakers

LJ Speech

MS-SNSD (Microsoft Scalable Noisy Speech Dataset)

OpenSLR Datasets
- famous for LibriSpeech and LibriTTS

Parkinson Speech Dataset

Thorsten Voice