Speech Recognition Resources


As part of my major in Computer Science and minor in Linguistics, I did some independent study work with speech recognition. Any material set apart in bold text is recommended: these items either were referenced more often or appeared to be more interesting than the others.


WWW Sites and Searches


The following sites contain links to other web sites and non-web resources containing related information.

SpeechLinks
Alex Hauptmann's Home Page
Yahoo! - Business and Economy:Companies:Computers:Software:Voice Recognition
HotBot - Results
Jason Eisner - Home Page
Spoken Language Systems Group
Link Page
Speech Communication Lab. Publications
Speech Related Sites
Speech Research Groups
Speech Understanding
ZIA Computer Science, Artificial Intelligence, Natural Language Processing, Speech Recognition Resources
21st Century Eloquence: Speech Recognition / Voice Recognition Specialists
Tur’s Gaf
Speech


Current Work.. Specific Research & Applications

BBN Speech and Language Processing
Contains information on BYBLOS, their speaker-independent system. BBN claims high accuracy; their stated mission is to “develop the technologies that will enable multi-modal, multi-lingual, human-to-human and human-to-machine communication, collaboration, and information access -- anytime, anywhere, anyhow.”

Microsoft Research Natural Language Processing Group
Speech Systems, Inc. - Advanced Speech Recognition for Computers
VODIS


Current Work.. General Research and Information

http://www.ta.doc.gov/aptp/japan/techlit/speech2.htm
Japan's research into speech recognition in general. Contains historical information on Japan's work, links and information for those groups who have contributed to their research, and also a reference list. Assumed to be related to non-native English speech; all the sites and material referenced are located in Japan.

THE USE OF WORD, PHRASE AND INTENT ACCURACY AS MEASURES OF CONNECTED SPEECH RECOGNITION PERFORMANCE
A study of the accuracy of a speech recognition system used in aircraft cockpits. Phrases were divided into three groups: complex, simple, and no-alternate. System performance was measured in terms of word, phrase, and intent accuracy. May not involve non-native speakers of English, although the work was done with a speaker-independent system. A good site for learning how to evaluate and analyze a speech system.

Multicom Research Inc.
The site itself is not helpful; however, it contains a link to http://www.is.cs.cmu.edu/VODIS/, which contains information about the Voice Operated Driving Information System. VODIS research includes non-native English speech.

comp.speech WWW sites
http://www.speech.cs.cmu.edu/comp.speech/index.html or http://fortis.speech.su.oz.au/comp.speech/
An FAQ on speech technology and related areas. Contains sections on NLP, many links to sites with related information, and a variety of newsgroups and mailing lists. Excellent site for learning speech technology fundamentals.

CSLU Home Page
Center for Spoken Language Understanding. Contains an introduction to research in three areas: spoken language understanding, speaker recognition, and automatic language identification. Also contains summaries of various projects, including speech corpora available for download.

Survey of the State of the Art in Human Language Technology


Potential Software

IBM - Simply Speaking
ART Advanced Recognition Technologies
Voice Recognition & Voice-activated Medical Reporting from Kurzweil AI
Dragon Systems Speech and Voice Recognition Software, including DragonDictate (Dragon Dictate)
Verbex Voice Systems


Success Stories, Reviews, other notes

PC Magazine -- Trends Online: IBM Reveals Details of New, Improved OS/2 (4/23/96)
This document focuses on the addition of voice commands to an upcoming version of OS/2, codenamed Merlin.

Customer success stories
The contents of this page are undoubtedly biased towards IBM. Also of interest: one of the stories mentions having to train the system to recognize the user's speech.

1997 May Be Year for Speech Recognition
Compares current systems with work done in the past. Companies referenced: IBM, Motorola, Dragon Systems, and Kurzweil. Briefly discusses the shortcomings of discrete speech expected from current "cheap" systems.

The DragonDictate / Speech Recognition FAQ
Volunteer-maintained site. Contains links to other software companies and material similar to Dragon Systems' own FAQ. Also contains a comparison between DD and Kurzweil's speech systems and a brief discussion of IBM's VoiceType system.


Downloadable Software

Jerry's Windows 95 Miscellaneous Page
IBM VoiceType
Voice Recognition Speech Recognition Voice Recognition
Verbex Listen FreeWare
Creative MultiMedia Tools


No Demo Software Available (online)
Dragon Systems Speech and Voice Recognition Software, including DragonDictate (Dragon Dictate)
ART Advanced Recognition Technologies


<= $100
Voice Recognition & Speech Recognition-Kurzweil AI General Store
IBM - Simply Speaking
http://www.dragonsys.com/marketing/singles.html
smARTshop - Main
Verbex New Order Form

Usenet Newsgroup

comp.speech

Texts

  • Markowitz, Judith A. Using Speech Recognition. Prentice-Hall PTR. New Jersey. 1996.
    This text provides a broad overview of current speech technology. It covers all aspects of speech systems: system design, implementation, and evaluation; user and speaking environment issues; future direction; and historical information. It focuses on speech recognition and has little discussion of the other types of speech systems.

  • Syrdal, Ann K, et al. Applied Speech Technology. CRC Press, Inc. Boca Raton. 1995.
    This text provides a more detailed look at all currently available speech technology, including case studies.

  • Finegan, Edward. Language: Its Structure and Use. Second Edition. Harcourt Brace & Company. USA. 1994.
  • Jannedy, Stefanie, et al., eds. Language Files: Materials for an Introduction to Language & Linguistics. Sixth Edition. Ohio State University Press. Columbus. 1994.
  • O'Grady, William, and Dobrovolsky, Michael, eds. Contemporary Linguistics: An Introduction. Third Edition. St. Martin's Press. New York. 1997.
    Finegan and Jannedy were valuable in reviewing the introductory phonology and phonetics material presented in UWM course 550-350. The O'Grady and Dobrovolsky textbook may seem "friendlier" to linguistics students, and its material is a bit clearer than that of the Finegan text.

  • Borden, Gloria J. and Harris, Katherine S. Speech Science Primer; Physiology, Acoustics, and Perception of Speech. Waverly Press, Inc. 1980.
    Contains a detailed overview of speech production and perception. The layout is cumbersome at times, but the content is informative.


    The following texts were considered but not used for this project.

  • Kaisse, Ellen. Connected Speech; The Interaction of Syntax and Phonology. Academic Press, Inc. Orlando. 1985.
    This text gets into detail about phonological processes that occur in fast speech, and also includes discussions on the processes that occur in different languages and their dialects including English, Puerto Rican Spanish, Modern Greek, and Japanese. Potential source of information on the different processes that can affect the accuracy of speech systems.

  • Bird, Steven. Studies in Natural Language Processing; Computational Phonology; A Constraint-Based Approach. Cambridge University Press. Great Britain. 1995.
    This text is also included as a potential source of information relevant to speech systems.

  • Grishman, Ralph. Studies in Natural Language Processing; Computational Linguistics; An Introduction. Cambridge University Press. Great Britain. 1986.
    Contains a general overview of computational linguistics. Includes discussions on language generation and on syntactic, semantic, and discourse analysis. Very brief discussion of speech understanding (pp. 87-89). Not referred to for this project.

  • Kaye, Jonathan. Tutorial Essays in Cognitive Science; Phonology: A Cognitive View. Lawrence Erlbaum Associates, Inc. Hillsdale. 1989.
    This text was also not used for this project. Kaye provides interesting comments on generative phonology and current trends in speech systems (Ch. 5).


    Feel free to give me any feedback on this list.
    Go back to my home page..