6 books on Speech Recognition [PDF]
Books on speech recognition usually explain how speech-to-text conversion works and which models and features it relies on (for example, neural networks, hidden Markov models (HMM) and MFCC features). They also cover applications of this technology in voice assistants, automatic translators, IVR systems, etc.
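To make the MFCC acronym a little more concrete: MFCC stands for mel-frequency cepstral coefficients, features computed on the mel scale, which spaces frequencies roughly the way human hearing does. The sketch below shows only that first ingredient, using one common variant of the Hz-to-mel formula (other variants exist). The function names are illustrative and are not taken from any of the books listed here.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(low_hz: float, high_hz: float, n_bands: int) -> list[float]:
    """Return n_bands + 2 edge frequencies (in Hz) spaced evenly in mels.

    Neighbouring triples of these edges are what a triangular mel
    filterbank would use as the left, centre and right points of a filter.
    """
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_bands + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_bands + 2)]


if __name__ == "__main__":
    # Edges get wider apart in Hz as frequency grows.
    for edge in mel_band_edges(0.0, 8000.0, 10):
        print(f"{edge:8.1f} Hz")
```

A full MFCC pipeline would also need framing, a Fourier transform, the filterbank itself, a logarithm and a discrete cosine transform; those steps are left out here on purpose.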
1. Python Speaks: A Guide to Developing Voice-Controlled Apps with Speech Recognition
2025 by Marlene Welch

This is a book about creating voice-controlled applications in Python. It starts by explaining why such programs are needed at all and then moves on to a step-by-step plan for creating them, beginning with installing Python on a computer. The main focus is on specialized Python libraries for converting speech to text: SpeechRecognition, CMU Sphinx and Mozilla DeepSpeech. The book also gives instructions for working with the cloud services Google Cloud Speech-to-Text API and Microsoft Azure Speech Service. However, the description is rather superficial, without examples.
Download PDF
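For readers who want a feel for the SpeechRecognition library mentioned in this review, here is a minimal sketch of transcribing a WAV file. It is not taken from the book. The function name `transcribe_wav`, the `ENGINES` table and the file name in the usage comment are assumptions made for illustration; the library calls themselves (`Recognizer`, `AudioFile`, `record`, `recognize_sphinx`, `recognize_google`) come from the SpeechRecognition package, which has to be installed separately.

```python
# Maps a short engine name to the Recognizer method that implements it.
ENGINES = {
    "sphinx": "recognize_sphinx",  # offline CMU Sphinx, needs the pocketsphinx package
    "google": "recognize_google",  # online Google Web Speech API
}


def transcribe_wav(path, engine="sphinx"):
    """Return the recognized text for a WAV file, or None if nothing was understood."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")

    # Imported here so the engine check above works without the package.
    import speech_recognition as sr  # pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    recognize = getattr(recognizer, ENGINES[engine])
    try:
        return recognize(audio)
    except sr.UnknownValueError:
        # The engine ran but could not make sense of the audio.
        return None
    except sr.RequestError as exc:
        # The engine is not installed or the online service is unreachable.
        raise RuntimeError(f"{engine} engine unavailable: {exc}") from exc


# Usage (hypothetical file name):
# print(transcribe_wav("command.wav", engine="sphinx"))
```

Note that `recognize_google` talks to the Google Web Speech API, which is a different service from the Google Cloud Speech-to-Text API named in the review.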
2. Speech Recognition: Fundamentals and Applications
2023 by Fouad Sabry

From this book I found out that automatic speech recognition (ASR) integrates computer science, linguistics and engineering. Nowadays deep learning and recurrent neural networks play a key role in advancing these technologies, and large language models, as well as long short-term memory (LSTM) models, are improving the accuracy of speech-to-text conversion. Popular applications of ASR include voice assistants, transcription and voice commands.
Download PDF
3. Deep Learning for NLP and Speech Recognition
2019 by Uday Kamath, John Liu, James Whitaker

This book contains case studies that demonstrate how deep learning is implemented in speech recognition. It argues that deep learning models, which are used to better understand spoken words, are essential for speech recognition and machine translation. NLP-based speech recognition can be integrated into various industries such as finance and healthcare. Practical applications of deep learning in NLP include document classification and voice analysis.
Download PDF
4. Speech Recognition Using Articulatory and Excitation Source Features
2017 by K. Sreenivasa Rao, Manjunath K E

These authors dig deeper into articulatory features that enhance the performance of speech recognition systems. In particular, the book explains why excitation source information helps differentiate sound units during speech production, and how combining spectral, articulatory and source features can improve recognition accuracy. You'll also learn how speech recognition can be adapted for scripted, spontaneous and conversational speech. The book also lists different models that capture sound-unit-specific information from articulatory and excitation features.
Download PDF
5. Automatic Speech Recognition: A Deep Learning Approach
2014 by Dong Yu, Li Deng

I think this was the first book that focused exclusively on the deep learning approach to ASR. If you read it, you'll learn the theory that underpins many successful deep learning models for ASR. Deep learning has revolutionized the accuracy and efficiency of voice-to-text conversion. The book also provides a clear mathematical treatment of the machine learning methods applied in ASR.
Download PDF
6. Advances in Speech Recognition: Mobile Environments, Call Centers and Clinics
2010 by Amy Neustein

This book was written at a time when speech recognition was moving from experimental technology to practical applications and mobile phones were starting to use voice user interfaces (VUI) instead of traditional GUIs. Voice recognition became vital in environments like call centers and clinical settings, and multimodal interfaces combining voice and graphical elements emerged. The book explores how speech recognition could be critical for modern human-computer interaction, especially in constrained environments.
Download PDF
How to download PDF:
1. Install Gooreader
2. Enter the Book ID in the search box and press Enter
3. Click the "Download Book" icon and select PDF*
* - note that for yellow books only preview pages are downloaded


