Synopses & Reviews
Mahout in Action is a hands-on introduction to machine learning with Apache Mahout. Following real-world examples, the book presents practical use cases and then illustrates how Mahout can be applied to solve them. Includes a free audio- and video-enhanced ebook.
About the Technology
A computer system that learns and adapts as it collects data can be really powerful. Mahout, Apache's open source machine learning project, captures the core algorithms of recommendation systems, classification, and clustering in ready-to-use, scalable libraries. With Mahout, you can immediately apply to your own projects the machine learning techniques that drive Amazon, Netflix, and others.
About this Book
This book covers machine learning using Apache Mahout. Based on experience with real-world applications, it introduces practical use cases and illustrates how Mahout can be applied to solve them. It places particular focus on issues of scalability and how to apply these techniques against large data sets using the Apache Hadoop framework.
This book is written for developers familiar with Java; no prior experience with Mahout is assumed.
Owners of a Manning pBook purchased anywhere in the world can download a free eBook from manning.com at any time. They can do so multiple times and in any or all formats available (PDF, ePub, or Kindle). To do so, customers must register their printed copy on Manning's site by creating a user account and then following the instructions printed on the pBook registration insert at the front of the book.
What's Inside
- Use group data to make individual recommendations
- Find logical clusters within your data
- Filter and refine with on-the-fly classification
- Free audio and video extras
Table of Contents
- Meet Apache Mahout
PART 1 RECOMMENDATIONS
- Introducing recommenders
- Representing recommender data
- Making recommendations
- Taking recommenders to production
- Distributing recommendation computations
PART 2 CLUSTERING
- Introduction to clustering
- Representing data
- Clustering algorithms in Mahout
- Evaluating and improving clustering quality
- Taking clustering to production
- Real-world applications of clustering
PART 3 CLASSIFICATION
- Introduction to classification
- Training a classifier
- Evaluating and tuning a classifier
- Deploying a classifier
- Case study: Shop It To Me
When computers use prior experience to improve future performance, they are applying a type of artificial intelligence called machine learning. The Apache Mahout project focuses on three types of machine learning that are of particular interest to modern web developers: recommendation systems, classification, and clustering.
Through real-world examples, Mahout in Action introduces the sorts of problems that these techniques are appropriate for, and then illustrates how Mahout can be applied to solve them. It places particular focus on issues of scalability, and how to apply these techniques at very large scale with the Apache Hadoop framework.
Web 2.0 applications provide a rich user experience, but the parts you can't see are just as important, and just as impressive. They use powerful techniques to process information intelligently and offer features based on patterns and relationships in data. Algorithms of the Intelligent Web shows readers how to use the same techniques employed by household names like Google AdSense, Netflix, and Amazon to transform raw data into actionable information.
Algorithms of the Intelligent Web is an example-driven blueprint for creating applications that collect, analyze, and act on the massive quantities of data users leave in their wake as they use the web. Readers learn to build Netflix-style recommendation engines, and how to apply the same techniques to social-networking sites. See how click-trace analysis can result in smarter ad rotations. All the examples are designed both to be reused and to illustrate a general technique (an algorithm) that applies to a broad range of scenarios.
As they work through the book's many examples, readers learn about recommendation systems, search and ranking, automatic grouping of similar objects, classification of objects, forecasting models, and autonomous agents. They also become familiar with a large number of open-source libraries and SDKs, and freely available APIs from the hottest sites on the internet, such as Facebook, Google, eBay, and Yahoo.
About the Author
Dr. Haralambos (Babis) Marmanis is a pioneer in the adoption of machine learning techniques for industrial solutions, and also a world expert in supply management. He has about twenty years of experience in developing professional software. Currently, he is the director of R&D and chief architect for expense management solutions at Emptoris, Inc. Babis holds a Ph.D. in applied mathematics from Brown University, an M.S. degree in theoretical and applied mechanics from the University of Illinois at Urbana-Champaign, and B.S. and M.S. degrees in civil engineering from the Aristotle University of Thessaloniki in Greece. He was the recipient of the Sigma Xi award for innovative research in 2000, and he is the author of numerous publications in peer-reviewed international scientific journals, conferences, and technical periodicals.
Dmitry Babenko is the lead for the data warehouse infrastructure at Emptoris, Inc. He is a software engineer and architect with 13 years of experience in the IT industry. He has designed and built a wide variety of applications and infrastructure frameworks for banking, insurance, supply-chain management, and business intelligence companies. He received an M.S. degree in computer science from Belarussian State University of Informatics and Radioelectronics.