Description
For developers working with big data, a theoretical understanding of Hadoop is not enough. They need to solve real challenges such as analyzing real-time streams, moving data securely between storage systems, and managing large-scale clusters. The Hadoop ecosystem is constantly growing, and developers must keep up with new technologies and practices to stay productive and to future-proof their data systems.
Hadoop in Practice, Second Edition provides over 100 tested, instantly useful techniques that will help you conquer big data using Hadoop. This revised edition covers changes and new features in the Hadoop core architecture, including MapReduce 2. Brand-new chapters cover YARN, real-time use cases, and integrating Kafka, Storm, and Spark with Hadoop. There are also new and updated techniques for Flume, Sqoop, and Mahout, all of which have seen major new versions recently. In short, this is the most practical, up-to-date coverage of Hadoop available anywhere.
RETAIL SELLING POINTS
Practical, up-to-date coverage
Over 100 practical, battle-tested Hadoop techniques
Major updates to key technologies
AUDIENCE
Readers should be familiar with Hadoop and have experience programming in Java or another OOP language.
ABOUT THE TECHNOLOGY
Hadoop is an open source MapReduce platform designed to query and analyze data distributed across large clusters. Especially effective for big data systems, Hadoop powers mission-critical software at Apple, eBay, LinkedIn, Yahoo, and Facebook. It offers organizations efficient ways to store, manage, and analyze data.