- Used Books
- Staff Picks
- Gifts & Gift Cards
- Sell Books
- Stores & Events
- Let's Talk Books
Special Offers see all
More at Powell's
Recently Viewed clear list
New Trade Paper
Ships in 1 to 3 days
Other titles in the Addison-Wesley Data & Analytics series:
Apache Hadoop Yarn: Moving Beyond Mapreduce and Batch Processing with Apache Hadoop 2 (Addison-Wesley Data and Analytics)by Arun C Murthy
Synopses & Reviews
“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.”
—From the Foreword by Raymie Stata, CEO of Altiscale
The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN
Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.
YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.
You’ll find many examples drawn from the authors’ cutting-edge experience—first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.
Apache Hadoop is right at the heart of the Big Data revolution. In the brand-new Release 2, Hadoop’s data processing has been thoroughly overhauled. The result is Apache Hadoop YARN, a generic compute fabric providing resource management at datacenter scale, and a simple method to implement distributed applications such as MapReduce to process petabytes of data on Apache Hadoop HDFS. Apache Hadoop 2 and YARN truly deserve to be called breakthroughs.
In Apache Hadoop YARN , key YARN developer Arun Murthy shows how the key design changes in Apache Hadoop lead to increased scalability and cluster utilization, new programming models and services, and the ability to move beyond Java and batch processing within the Hadoop ecosystem. Readers also learn to run existing applications like Pig and Hive under the Apache Hadoop 2 MapReduce framework, and develop new applications that take absolutely full advantage of Hadoop YARN resources. Drawing on insights from the entire Apache Hadoop 2 team, Murthy and Dr. Douglas Eadline:
About the Author
Arun Murthy (California) has contributed to Apache Hadoop full-time since the inception of the project in early 2006. He is a long-term Hadoop Committer and a member of the Apache Hadoop Project Management Committee. Previously, he was the architect and lead of the Yahoo Hadoop Map-Reduce development team and was ultimately responsible, technically, for providing Hadoop Map-Reduce as a service for all of Yahoo - currently running on nearly 50,000 machines! Arun is the Founder and Architect of the Hortonworks Inc., a software company that is helping to accelerate the development and adoption of Apache Hadoop. Hortonworks was formed by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team in June 2011 in order to accelerate the development and adoption of Apache Hadoop. Funded by Yahoo! and Benchmark Capital, one of the preeminent technology investors, their goal is to ensure that Apache Hadoop becomes the standard platform for storing, processing, managing and analyzing big data. He lives in Silicon Valley in California.
Douglas Eadline (Pennsylvania), PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC computing. Prior to starting and editing the popular ClusterMonkey.net web site in 2005, he served as Editorinchief for ClusterWorld Magazine, and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. Both clients and readers have recognized Dr. Eadline's ability to present a "technological value proposition" in a clear and accurate style. He has practical hands on experience in many aspects of HPC including, hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing.
Table of Contents
1. YARN Quick Start
2. YARN and the Hadoop Ecosystem
3. Functional Overview of YARN Components
4. Installing YARN
5. Running Applications with YARN
6. YARN Administration
7. YARN Architecture Guide
8. Writing a Simple YARN Application
9. Using YARN Distributed Shell
10. Accelerating Applications with Tez
11. YARN Frameworks
Appendix A. Navigating and Joining the Hadoop Ecosystem
Appendix B. YARN Software API Reference
What Our Readers Are Saying
Computers and Internet » Computers Reference » General