Programming Hive

by Edward Capriolo, Dean Wampler, and Jason Rutherglen

Synopses & Reviews

Publisher Comments:

Need to move a relational database application to Hadoop? This comprehensive guide introduces you to Apache Hive, Hadoop's data warehouse infrastructure. You'll quickly learn how to use Hive's SQL dialect, HiveQL, to summarize, query, and analyze large datasets stored in Hadoop's distributed filesystem.

This example-driven guide shows you how to set up and configure Hive in your environment, provides a detailed overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You'll also find real-world case studies that describe how companies have used Hive to solve unique problems involving petabytes of data.

  • Use Hive to create, alter, and drop databases, tables, views, functions, and indexes
  • Customize data formats and storage options, from files to external databases
  • Load and extract data from tables, and use queries, grouping, filtering, joining, and other conventional query methods
  • Learn best practices for creating user-defined functions (UDFs)
  • Learn Hive patterns you should use and anti-patterns you should avoid
  • Integrate Hive with other data processing programs
  • Use storage handlers for NoSQL databases and other datastores
  • Learn the pros and cons of running Hive on Amazon's Elastic MapReduce

Synopsis:

Hive makes life much easier for developers who work with stored and managed data in Hadoop clusters, such as data warehouses. With this example-driven guide, you'll learn how to use the Hive infrastructure to provide data summarization, query, and analysis, particularly with HiveQL, Hive's dialect of SQL.

You'll learn how to set up Hive in your environment and optimize its use, and how it interoperates with other tools, such as HBase. You'll also learn how to extend Hive with custom code written in Java or scripting languages. Ideal for developers with prior SQL experience, this book shows you how Hive simplifies many tasks that would be much harder to implement in the lower-level MapReduce API provided by Hadoop.

About the Author

Edward Capriolo is currently System Administrator at Media6degrees where he helps design and maintain distributed data storage systems for the internet advertising industry.

Edward is a member of the Apache Software Foundation and a committer for the Hadoop-Hive project. He has experience as a developer as well as a Linux and network administrator, and enjoys the rich world of open source software.

Dean Wampler is a Principal Consultant at Think Big Analytics, where he specializes in "Big Data" problems and tools like Hadoop and Machine Learning. Besides Big Data, he specializes in Scala, the JVM ecosystem, JavaScript, Ruby, functional and object-oriented programming, and Agile methods. Dean is a frequent speaker at industry and academic conferences on these topics. He has a Ph.D. in Physics from the University of Washington.

Jason Rutherglen is a software architect at Think Big Analytics and specializes in Big Data, Hadoop, search, and security.

Table of Contents

Preface
Chapter 1: Introduction
Chapter 2: Getting Started
Chapter 3: Data Types and File Formats
Chapter 4: HiveQL: Data Definition
Chapter 5: HiveQL: Data Manipulation
Chapter 6: HiveQL: Queries
Chapter 7: HiveQL: Views
Chapter 8: HiveQL: Indexes
Chapter 9: Schema Design
Chapter 10: Tuning
Chapter 11: Other File Formats and Compression
Chapter 12: Developing
Chapter 13: Functions
Chapter 14: Streaming
Chapter 15: Customizing Hive File and Record Formats
Chapter 16: Hive Thrift Service
Chapter 17: Storage Handlers and NoSQL
Chapter 18: Security
Chapter 19: Locking
Chapter 20: Hive Integration with Oozie
Chapter 21: Hive and Amazon Web Services (AWS)
Chapter 22: HCatalog
Chapter 23: Case Studies
Glossary
References
Colophon

Product Details

ISBN: 9781449319335
Authors: Capriolo, Edward; Wampler, Dean; Rutherglen, Jason
Publisher: O'Reilly Media
Subjects: Programming Languages - Java; Database-Data Warehousing; HBase; SQL; database; hadoop; hive; hiveql; java; programming; query language; scripting language
Edition Description: Print PDF
Publication Date: 2012-10-06
Binding: Paperback
Language: English
Pages: 352
Dimensions: 9.19 x 7 in
