listen-attend-spell The Listen, Attend and Spell (LAS) architecture is a deep neural network designed to jointly learn to align and transcribe speech data.
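
To make "jointly learn to align and transcribe" more concrete, here is a minimal sketch of a single attention step of the kind LAS-style models use to softly align an output character with the input audio frames. It is not this repository's implementation: the function name `attention_context`, the plain dot-product scoring, and the toy inputs are all assumptions chosen for illustration, and a real model would learn the encoder features and decoder state with recurrent layers.

```python
import math


def attention_context(decoder_state, encoder_features):
    """Return (weights, context) for one decoder step.

    decoder_state: list of floats, length d (hypothetical decoder state).
    encoder_features: list of T lists of floats, each of length d
        (hypothetical encoder outputs, one per audio time step).
    """
    # Dot-product energies: one score per encoder time step.
    energies = [
        sum(s * h for s, h in zip(decoder_state, h_t))
        for h_t in encoder_features
    ]
    # Numerically stable softmax turns the energies into a soft alignment.
    peak = max(energies)
    exps = [math.exp(e - peak) for e in energies]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: alignment-weighted sum of the encoder features.
    dim = len(decoder_state)
    context = [
        sum(w * h_t[j] for w, h_t in zip(weights, encoder_features))
        for j in range(dim)
    ]
    return weights, context


# Toy example (made-up numbers): three encoder frames, 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [0.0, 2.0]
weights, context = attention_context(state, features)
```

In this sketch the weights form a probability distribution over the input frames, so frames that score higher against the decoder state contribute more to the context vector. Because the weights come from a differentiable softmax, the alignment can be trained together with the transcription objective instead of being supplied by a separate aligner.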