Synopses & Reviews
Like the popular second edition, Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. For example, an association rule such as {onions, potatoes} -> {beef} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, he or she is also likely to buy beef. The authors include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research.
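To make the supermarket rule concrete, here is a minimal sketch of how the strength of a rule like {onions, potatoes} -> {beef} is commonly measured using support and confidence. The basket data and the function names `support` and `confidence` are made up for illustration; they are not taken from the book or from Weka.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    left = frozenset(antecedent)
    both = left | frozenset(consequent)
    left_support = support(transactions, left)
    if left_support == 0:
        return 0.0  # the rule never applies, so report zero rather than divide by zero
    return support(transactions, both) / left_support


# Hypothetical sales data: each frozenset is one customer's basket.
baskets = [
    frozenset({"onions", "potatoes", "beef"}),
    frozenset({"onions", "potatoes", "beef", "milk"}),
    frozenset({"onions", "potatoes"}),
    frozenset({"milk", "bread"}),
    frozenset({"beef", "bread"}),
]

rule_support = support(baskets, {"onions", "potatoes", "beef"})
rule_confidence = confidence(baskets, {"onions", "potatoes"}, {"beef"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

In this toy data, two of the five baskets contain all three items, and two of the three baskets with onions and potatoes also contain beef, so the rule has support 2/5 and confidence 2/3. Association rule learners typically keep only rules whose support and confidence exceed user-chosen thresholds.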
Complementing the book is the fully functional, platform-independent, open source Weka software for machine learning, available for free download.
The book is a major revision of the second edition that appeared in 2005. While the basic core remains the same, it has been updated to reflect the changes that have taken place over the last four or five years. Highlights of the new edition include completely revised technique sections; new chapters on Data Transformations, Ensemble Learning, and Massive Data Sets; a new "book release" version of the popular Weka machine learning open source software (developed by the authors and specific to the Third Edition); new material on "multi-instance learning"; new information on ranking the classification; plus comprehensive updates and modernization throughout. All in all, the edition adds approximately 100 pages of new material.
* Thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques
* Algorithmic methods at the heart of successful data mining, including tried-and-true methods as well as leading-edge methods
* Performance improvement techniques that work by transforming the input or output
* Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualization, in an updated, interactive interface.
"The authors provide enough theory to enable practical application, and it is this practical focus that separates this book from most, if not all, other books on this subject." - Dorian Pyle, Director of Modeling at Numetrics and an internationally known author of Data Preparation for Data Mining (Morgan Kaufmann, 1999) and Business Modeling for Data Mining (Morgan Kaufmann, 2003)
"This book would be a strong contender for a technical data mining course. It is one of the best of its kind." - Herb Edelstein, Principal, Data Mining Consultant, Two Crows Consulting
"It is certainly one of my favorite data mining books in my library." - Tom Breur, Principal, XLNT Consulting, Tilburg, The Netherlands
Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research.
* Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects
* Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
* Includes the downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. Algorithms in the toolkit cover data pre-processing, classification, regression, clustering, association rules, and visualization.
If you have data you want to analyze and understand, this book and the associated WEKA Toolkit will get you the results you seek!
About the Author
Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann.
Eibe Frank lives in New Zealand with his Samoan spouse and two lovely boys, but originally hails from Germany, where he received his first degree in computer science from the University of Karlsruhe. He moved to New Zealand to pursue his Ph.D. in machine learning under the supervision of Ian H. Witten, and joined the Department of Computer Science at the University of Waikato as a lecturer on completion of his studies. He is now an associate professor at the same institution. As an early adopter of the Java programming language, he laid the groundwork for the Weka software described in this book. He has contributed a number of publications on machine learning and data mining to the literature and has refereed for many conferences and journals in these areas.
Mark A. Hall was born in England but moved to New Zealand with his parents as a young boy. He now lives with his wife and four young children in a small town situated within an hour’s drive of the University of Waikato. He holds a bachelor’s degree in computing and mathematical sciences and a Ph.D. in computer science, both from the University of Waikato. Throughout his time at Waikato, as a student and lecturer in computer science and more recently as a software developer and data mining consultant for Pentaho, an open-source business intelligence software company, Mark has been a core contributor to the Weka software described in this book. He has published a number of articles on machine learning and data mining and has refereed for conferences and journals in these areas.
University of Waikato, Hamilton, New Zealand. Recipient of the 2005 ACM SIGKDD Service Award.
Table of Contents
PART I: Introduction to Data Mining
Ch 1 What's It All About?
Ch 2 Input: Concepts, Instances, Attributes
Ch 3 Output: Knowledge Representation
Ch 4 Algorithms: The Basic Methods
Ch 5 Credibility: Evaluating What's Been Learned
PART II: Advanced Data Mining
Ch 6 Implementations: Real Machine Learning Schemes
Ch 7 Data Transformation
Ch 8 Ensemble Learning
Ch 9 Moving On: Applications and Beyond
PART III: The Weka Data Mining Workbench
Ch 10 Introduction to Weka
Ch 11 The Explorer
Ch 12 The Knowledge Flow Interface
Ch 13 The Experimenter
Ch 14 The Command-Line Interface
Ch 15 Embedded Machine Learning
Ch 16 Writing New Learning Schemes
Ch 17 Tutorial Exercises for the Weka Explorer