Text Mining Application Programming by Manu Konchady
Synopses & Reviews
Text Mining Application Programming teaches software developers how to mine the vast amounts of information available on the Web, internal networks, and desktop files and turn it into usable data. The book helps developers understand the problems associated with managing unstructured text, and explains how to build your own mining tools using standard statistical methods from information theory, artificial intelligence, and operations research. Each topic is thoroughly explained and then followed by a practical implementation. The book begins with a brief overview of text data, where it can be found, and the typical search engines and tools used to search and gather this text. It details how to build tools for extracting and using the text, and covers the mathematics behind many of the algorithms used in building these tools. From there you'll learn how to build tokens from text, construct indexes, and detect patterns in text. You'll also find methods to extract the names of people, places, and organizations from an email, a news article, or a Web page. The next portion of the book teaches you how to find information on the Web, describes the structure of the Web, and shows how to build spiders to crawl it. Text categorization is also described in the context of managing email. The final part of the book covers information monitoring, summarization, and a simple Question and Answer (Q&A) system. The code used in the book is written in Perl, but knowledge of Perl is not necessary to run the software. Developers with an intermediate level of experience with Perl can customize the software. Although the book is about programming, methods are explained with English-like pseudocode and the source code is provided on the CD-ROM. After reading this book, you'll be ready to tap into the bevy of information available online in ways you never thought possible.
Book News Annotation:
After reviewing some standard statistical concepts and linear algebra, this book explains methods for building tokens from text, constructing indexes, and detecting patterns in text that may be helpful to software developers managing unstructured text. The second half covers information extraction, the structure of the web, the development of a search engine, cluster organization, text categorization, and question and answer systems. The CD-ROM contains open source tools to test the text mining functions.
Annotation ©2006 Book News, Inc., Portland, OR (booknews.com)
Text mining offers a way for individuals and corporations to exploit the vast amount of information available on the Internet. Text Mining Application Programming teaches developers about the problems of managing unstructured text, and describes how to build tools for text mining using standard statistical methods from Artificial Intelligence and Operations Research. These tools can be used in a variety of fields, including law, business, and medicine. Key topics covered include information extraction, clustering, text categorization, searching the Web, summarization, and natural language query systems. The book explains the theory behind each topic and algorithm, and then provides a practical implementation with which developers and students can experiment. A wide variety of code is also included for developers to build their own custom solutions. After reading through this book, developers will be able to tap into the bevy of information available online in ways they never thought possible, and students will have a thorough understanding of the theory and practical application of text mining.
About the Author
Manu Konchady (Oakton, VA) is a consultant working on open source text mining software. Previously, he worked at Mitre Corp., where he designed and developed software to mine the Internet. He received his Ph.D. in Information Technology from George Mason University, and his articles have appeared in Dr. Dobb's Journal and Linux Journal.
Table of Contents
Preface
Acknowledgments
Chapter 1 Introduction
Chapter 2 Mathematics Background
Chapter 3 Exploring Text
Chapter 4 Markov Models and POS Tagging
Chapter 5 Information Extraction
Chapter 6 Search Engines
Chapter 7 Searching the Web
Chapter 8 Clustering Documents
Chapter 9 Text Categorization
Chapter 10 Summarization
Chapter 11 Question and Answer
About the CD-ROM
Index