Services like social networks, web analytics, and intelligent e-commerce often need to manage data at a scale too big for a traditional database. As scale and demand increase, so does Complexity. Fortunately, scalability and simplicity are not mutually exclusive—rather than using some trendy technology, a different approach is needed. Big data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.
Big Data shows how to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy to understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to use them in practice, and how to deploy and operate them once they're built.
Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.
Nathan Marz is an engineer at Twitter. He was previously Lead Engineer at BackType, a marketing intelligence company that was acquired by Twitter in July of 2011. He is the author of two major open source projects: Storm, a distributed realtime computation system, and Cascalog, a tool for processing data on Hadoop. He is a frequent speaker and writes a blog at nathanmarz.com.
James Warren is an analytics architect at Storm8 with a background in big data processing, machine learning and scientific computing.
Powell's City of Books is an independent bookstore in Portland, Oregon, that fills a whole city block with more than a million new, used, and out of print books. Shop those shelves — plus literally millions more books, DVDs, and gifts — here at Powells.com.