Apache Sqoop Cookbook
by Kathleen Ting
Synopses & Reviews
Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop.
Sqoop is both powerful and bewildering, but with this cookbook's problem-solution-discussion format, you'll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.
Relational database systems often store valuable data in a company. If made available, that data can be managed and processed by Apache Hadoop, which is fast becoming the standard for big data processing. As a result, relational database vendors have developed integration with Hadoop within one or more of their products. Transferring data to and from relational databases is challenging and laborious. Because data transfer requires careful handling, Apache Sqoop, short for “SQL to Hadoop,” was created to perform bidirectional data transfer between Hadoop and almost any external structured datastore. In this book, we'll focus on applying the arguments in common use cases to help you deploy and use Sqoop in your environment.
Given that Sqoop was designed for power users, there is a need for a standalone recipe-based book on Sqoop that covers common usage and use cases: the various options of the Sqoop import and export commands, Sqoop's integration with Oozie, Hive, and HBase, specialized connectors, and common issues.
About the Author
Kathleen Ting is currently a Customer Operations Engineering Manager at Cloudera where she helps customers deploy and use the Hadoop ecosystem in production. She has spoken on Hadoop, ZooKeeper, and Sqoop at many Big Data conferences including Hadoop World, ApacheCon, and OSCON. She's contributed to several projects in the open source community and is a Committer and PMC Member on Sqoop.
Jarek Jarcec Cecho is currently a Software Engineer at Cloudera where he develops software to help customers better access and integrate with the Hadoop ecosystem. He has led the Sqoop community in the architecture of the next generation of Sqoop, known as Sqoop 2. He's contributed to several projects in the open source community and is a Committer and PMC Member on Sqoop, Flume, and MRUnit.
Table of Contents
- Foreword
- Preface
- Chapter 1: Getting Started
- Chapter 2: Importing Data
- Chapter 3: Incremental Import
- Chapter 4: Free-Form Query Import
- Chapter 5: Export
- Chapter 6: Hadoop Ecosystem Integration
- Chapter 7: Specialized Connectors
- Colophon