High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark

by Holden Karau, Rachel Warren

ISBN13: 9781491943205
ISBN10: 1491943203
Condition: Standard

$24.00

List Price:~~$49.99~~

Used Trade Paperback

Ships in 1 to 3 days

Qty	Store
1	Burnside

Synopses & Reviews

Synopsis

If you've successfully used Apache Spark to solve medium-sized problems but still struggle to realize the "Spark promise" of unparalleled performance on big data, this book is for you. High Performance Spark shows you how to take advantage of Spark at scale, so you can grow beyond the novice level. It's ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications.

Learn how to make Spark jobs run faster
Productionize exploratory data science with Spark
Handle even larger data sets with Spark
Reduce pipeline running times for faster insights

Synopsis

Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.

Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.

With this book, you'll explore:

How Spark SQL's new interfaces improve performance over Spark's RDD data structure
The choice between data joins in Core Spark and Spark SQL
Techniques for getting the most out of standard RDD transformations
How to work around performance issues in Spark's key/value pair paradigm
Writing high-performance Spark code without Scala or the JVM
How to test for functionality and performance when applying suggested improvements
Using Spark MLlib and Spark ML machine learning libraries
Spark's Streaming components and external community packages

Product Details

ISBN: 9781491943205
Binding: Trade Paperback
Publication date: 07/11/2017
Publisher: OREILLY & ASSOCIATES INC
Pages: 356
Height: 0.70 in
Width: 7.00 in
Author: Holden Karau
Author: Rachel Warren

More copies of this ISBN

New, Trade Paperback, $49.99