Qualifying orders ship free.
$34.99
List price: $39.99
New Trade Paper
Ships in 1 to 3 days

Enterprise Data Workflows with Cascading

by Paco Nathan

Synopses & Reviews

Publisher Comments:

Despite Hadoop's growing use in the enterprise, building applications for it is notoriously difficult. But there is a solution. This hands-on book introduces you to Cascading, the framework that enables you to build powerful data processing applications on Hadoop without having to spend months learning the intricacies of MapReduce.

Whether you're a developer, data scientist, or system/IT administrator, you'll quickly learn Cascading's streamlined approach to data processing, data filtering, and workflow optimization, using sample apps based on Java, Scala, and Clojure. Companies such as Etsy, Razorfish, TeleNav, and Twitter already use Cascading for mission-critical applications. This book shows you how this framework can help your organization extract meaningful information from large amounts of distributed data.

  • Examine best practices for using data science in enterprise-scale apps
  • Learn how to use workflows that reach beyond MapReduce to integrate other popular Big Data frameworks
  • Quickly build and test applications with familiar constructs and reusable components, and instantly deploy them onto large clusters
  • Easily discover, model, and analyze both unstructured and semi-structured data in any format and from any source
  • Seamlessly move and scale application deployments from development to production, regardless of cluster location or data size

Synopsis:

There is an easier way to build Hadoop applications. With this hands-on book, you'll learn how to use Cascading, the open source abstraction framework for Hadoop that lets you easily create and manage powerful enterprise-grade data processing applications, without having to learn the intricacies of MapReduce.

Working with sample apps based on Java and other JVM languages, you'll quickly learn Cascading's streamlined approach to data processing, data filtering, and workflow optimization. This book demonstrates how this framework can help your business extract meaningful information from large amounts of distributed data.

  • Start working on Cascading example projects right away
  • Model and analyze unstructured data in any format, from any source
  • Build and test applications with familiar constructs and reusable components
  • Work with the Scalding and Cascalog Domain-Specific Languages
  • Easily deploy applications to Hadoop, regardless of cluster location or data size
  • Build workflows that integrate several big data frameworks and processes
  • Explore common use cases for Cascading, including features and tools that support them
  • Examine a case study that uses a dataset from the Open Data Initiative

About the Author

Paco Nathan is a Data Scientist at Concurrent, Inc., and heads up the developer outreach program there. He has a dual background from Stanford in math/stats and distributed computing, with 25+ years of experience in the tech industry. As an expert in Hadoop, R, predictive analytics, machine learning, and natural language processing, Paco has built and led several expert Data Science teams, with data infrastructure based on large-scale cloud deployments. He has presented twice on the AWS Start-Up Tour, and often gives talks about Hadoop, Data Science, and Cloud Computing.

Table of Contents

  • Preface
  • Chapter 1: Getting Started
  • Chapter 2: Extending Pipe Assemblies
  • Chapter 3: Test-Driven Development
  • Chapter 4: Scalding: A Scala DSL for Cascading
  • Chapter 5: Cascalog: A Clojure DSL for Cascading
  • Chapter 6: Beyond MapReduce
  • Chapter 7: The Workflow Abstraction
  • Chapter 8: Case Study: City of Palo Alto Open Data
  • Troubleshooting Workflows
  • Index
  • Colophon

Product Details

ISBN:
9781449358723
Author:
Nathan, Paco
Publisher:
O'Reilly Media
Subject:
Data processing
Subject:
Computers-Reference - General
Subject:
JVM;Java;big data;cascading;cascalog;clojure;data analysis;data science;devops;enterprise;hadoop;mapreduce;scala;scalding
Edition Description:
Trade Paper
Publication Date:
August 31, 2013
Binding:
TRADE PAPER
Language:
English
Pages:
170
Dimensions:
9.19 x 7 in

Related Subjects

Computers and Internet » Computers Reference » General
Computers and Internet » Internet » Apache
Computers and Internet » Internet » Servers
Engineering » Mechanical Engineering » General
