

Spoken Language Processing: A Guide to Theory, Algorithm and System Development

by Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon


Synopses & Reviews

Publisher Comments:

  • New advances in spoken language processing: theory and practice
  • In-depth coverage of speech processing, speech recognition, speech synthesis, spoken language understanding, and speech interface design
  • Many case studies from state-of-the-art systems, including examples from Microsoft's advanced research labs

Spoken Language Processing draws on the latest advances and techniques from multiple fields: computer science, electrical engineering, acoustics, linguistics, mathematics, psychology, and beyond. Starting with the fundamentals, it presents all this and more:

  • Essential background on speech production and perception, probability and information theory, and pattern recognition
  • Extracting information from the speech signal: useful representations and practical compression solutions
  • Modern speech recognition techniques: hidden Markov models, acoustic and language modeling, improving resistance to environmental noises, search algorithms, and large vocabulary speech recognition
  • Text-to-speech: analyzing documents, pitch and duration controls, trainable synthesis, and more
  • Spoken language understanding: dialog management, spoken language applications, and multimodal interfaces

To illustrate the book's methods, the authors present detailed case studies based on state-of-the-art systems, including Microsoft's Whisper speech recognizer, Whistler text-to-speech system, Dr. Who dialog system, and the MiPad handheld device. Whether you're planning, designing, building, or purchasing spoken language technology, this is the state of the art—from algorithms through business productivity.

About the Author

XUEDONG HUANG is founder and head of the Speech Technology Group at Microsoft Research. He received his Ph.D. from the University of Edinburgh. He is an IEEE Fellow.

ALEX ACERO and HSIAO-WUEN HON are Senior Researchers at Microsoft Research and Senior Members of IEEE. Both received doctorates from Carnegie Mellon University.

Foreword by Dr. Raj Reddy, Carnegie Mellon University

Table of Contents

(NOTE: Each chapter ends with Historical Perspective and Further Reading.)

1. Introduction.

Motivations. Spoken Language System Architecture. Book Organization. Target Audiences.

I. FUNDAMENTAL THEORY.

2. Spoken Language Structure.

Sound and Human Speech Systems. Phonetics and Phonology. Syllables and Words. Syntax and Semantics.

3. Probability, Statistics, and Information Theory.

Probability Theory. Estimation Theory. Significance Testing. Information Theory.

4. Pattern Recognition.

Bayes' Decision Theory. How to Construct Classifiers. Discriminative Training. Unsupervised Estimation Methods. Classification and Regression Trees.

II. SPEECH PROCESSING.

5. Digital Signal Processing.

Digital Signals and Systems. Continuous-Frequency Transforms. Discrete-Frequency Transforms. Digital Filters and Windows. Digital Processing of Analog Signals. Multirate Signal Processing. Filterbanks. Stochastic Processes.

6. Speech Signal Representations.

Short-Time Fourier Analysis. Acoustical Model of Speech Production. Linear Predictive Coding. Cepstral Processing. Perceptually Motivated Representations. Formant Frequencies. The Role of Pitch.

7. Speech Coding.

Speech Coders Attributes. Scalar Waveform Coders. Scalar Frequency Domain Coders. Code Excited Linear Prediction (CELP). Low-Bit Rate Speech Coders.

III. SPEECH RECOGNITION.

8. Hidden Markov Models.

The Markov Chain. Definition of the Hidden Markov Model. Continuous and Semicontinuous HMMs. Practical Issues in Using HMMs. HMM Limitations.

9. Acoustic Modeling.

Variability in the Speech Signal. How to Measure Speech Recognition Errors. Signal Processing: Extracting Features. Phonetic Modeling: Selecting Appropriate Units. Acoustic Modeling: Scoring Acoustic Features. Adaptive Techniques: Minimizing Mismatches. Confidence Measures: Measuring the Reliability. Other Techniques. Case Study: Whisper.

10. Environmental Robustness.

The Acoustical Environment. Acoustical Transducers. Adaptive Echo Cancellation (AEC). Multimicrophone Speech Enhancement. Environment Compensation Preprocessing. Environment Model Adaptation. Modeling Nonstationary Noise.

11. Language Modeling.

Formal Language Theory. Stochastic Language Models. Complexity Measure of Language Models. N-Gram Smoothing. Adaptive Language Models. Practical Issues.

12. Basic Search Algorithms.

Basic Search Algorithms. Search Algorithms for Speech Recognition. Language Model States. Time-Synchronous Viterbi Beam Search. Stack Decoding (A* Search).

13. Large-Vocabulary Search Algorithms.

Efficient Manipulation of a Tree Lexicon. Other Efficient Search Techniques. N-Best and Multipass Search Strategies. Search-Algorithm Evaluation. Case Study: Microsoft Whisper.

IV. TEXT-TO-SPEECH SYSTEMS.

14. Text and Phonetic Analysis.

Modules and Data Flow. Lexicon. Document Structure Detection. Text Normalization. Linguistic Analysis. Homograph Disambiguation. Morphological Analysis. Letter-to-Sound Conversion. Evaluation. Case Study: Festival.

15. Prosody.

The Role of Understanding. Prosody Generation Schematic. Speaking Style. Symbolic Prosody. Duration Assignment. Pitch Generation. Prosody Markup Languages. Prosody Evaluation.

16. Speech Synthesis.

Attributes of Speech Synthesis. Formant Speech Synthesis. Concatenative Speech Synthesis. Prosodic Modification of Speech. Source-Filter Models for Prosody Modification. Evaluation of TTS Systems.

V. SPOKEN LANGUAGE SYSTEMS.

17. Spoken Language Understanding.

Written vs. Spoken Languages. Dialog Structure. Semantic Representation. Sentence Interpretation. Discourse Analysis. Dialog Management. Response Generation and Rendition. Evaluation. Case Study: Dr. Who.

18. Applications and User Interfaces.

Application Architecture. Typical Applications. Speech Interface Design. Internationalization. Case Study: MiPad.

Index.


Product Details

ISBN:
9780130226167
Subtitle:
A Guide to Theory, Algorithm and System Development
Foreword:
Reddy, Raj
Author:
Acero, Alex
Author:
Huang, Xuedong
Author:
Hon, Hsiao-Wuen
Author:
Reddy, Raj
Publisher:
Prentice Hall PTR
Location:
Upper Saddle River, NJ
Subject:
Linguistics
Subject:
Telecommunications
Subject:
Networking - General
Subject:
Natural Language Processing
Subject:
Natural language processing (computer science)
Subject:
Engineering / Electrical
Edition Description:
Trade paper
Series Volume:
103-200
Publication Date:
April 2001
Binding:
Hardcover
Grade Level:
Professional and scholarly
Language:
English
Illustrations:
Yes
Pages:
1008
Dimensions:
9.08x6.96x1.92 in. 3.15 lbs.
