Neural Text-To-Speech Synthesis

Tan, Xu

Book: Hardcover

Ranking 30694 in

CHF 188.00

Book from CHF 188.00

E-book CHF 177.00

Description

Text-to-speech (TTS) aims to synthesize intelligible and natural speech from a given text. It is a hot topic in language, speech, and machine learning research and has broad applications in industry. This book introduces neural network-based TTS in the era of deep learning, aiming to provide a good understanding of neural TTS, current research and applications, and future research trends.

The book first introduces the history of TTS technologies, gives an overview of neural TTS, and provides preliminary knowledge on language and speech processing, neural networks and deep learning, and deep generative models. It then introduces neural TTS from the perspective of key components (text analysis, acoustic models, vocoders, and end-to-end models) and advanced topics (expressive and controllable, robust, model-efficient, and data-efficient TTS). It also points out some future research directions and collects some resources related to TTS.

This book is the first to introduce neural TTS in a comprehensive and easy-to-understand way and can serve both academic researchers and industry practitioners working on TTS.

Details

ISBN/GTIN: 978-981-99-0826-4

Product Type: Book

Binding: Hardcover

Publisher: Springer

Publishing date: 30/05/2023

Edition: 2023

Series: Artificial Intelligence: Foundations, Theory, and Algorithms

Pages: 228

Language: English

Size: Width 160 mm, Height 241 mm, Thickness 18 mm

Weight: 512 g

Article no.: 45463420

Catalogs: Buchzentrum

Data source no.: 43944833

Series

Feature Selection for High-Dimensional Data

Bridging Constraint Satisfaction and Boolean Satisfiability

Hybrid Metaheuristics

Subjective Logic

Decision Diagrams for Optimization

Coordination of Complex Sociotechnical Systems

Responsible Artificial Intelligence

The Geometry of Uncertainty

Heterogeneous Graph Representation Learning and Applications

Machine Learning Safety

Foundation Models for Natural Language Processing

AI Ethics

Hypergraph Computation

Neural Text-To-Speech Synthesis

Electronic Institutions

Towards a Code of Ethics for Artificial Intelligence

Author

Tan, Xu

Xu Tan is a Principal Researcher and Research Manager at Microsoft Research Asia. His research interests cover deep learning and its applications in language, speech, and music processing and in digital human creation. He has extensive research experience in text-to-speech synthesis. He has developed high-quality TTS systems such as FastSpeech 1/2 (widely used in the TTS community), DelightfulTTS (winner of the Blizzard TTS Challenge), and NaturalSpeech (achieving human-level quality on the TTS benchmark dataset), and has transferred many research results to improve the experience of Microsoft Azure TTS services. He has given a series of tutorials on TTS at top conferences such as IJCAI, ICASSP, and INTERSPEECH, and has written a comprehensive survey paper on TTS. Besides speech synthesis, he has designed several popular language models (e.g., MASS) and AI music systems (e.g., Muzic), and has developed machine translation systems that achieved human parity in Chinese-English translation and won several first places in WMT machine translation competitions. He has published over 100 papers at prestigious venues such as ICML, NeurIPS, ICLR, AAAI, IJCAI, ACL, EMNLP, NAACL, ICASSP, INTERSPEECH, KDD, and IEEE/ACM Transactions, and has served as an area chair or action editor for several AI conferences and journals (e.g., NeurIPS, AAAI, ICASSP, TMLR).
