Description

Book Synopsis

Working with big data can be complex and challenging, in part because of
the multiple analysis frameworks and tools required. Apache Spark is a big
data processing framework perfect for analyzing near-real-time streams and
discovering historical patterns in batched data sets. But Spark goes much
further than other frameworks. By including machine learning and graph
processing capabilities, it makes many specialized data processing
platforms obsolete. Spark's unified framework and programming model
significantly lower the initial infrastructure investment, and Spark's core
abstractions are intuitive for most Scala, Java, and Python developers.

Spark in Action teaches readers to use Spark for stream and batch data
processing. It starts with an introduction to the Spark architecture and
ecosystem, followed by a taste of Spark's command-line interface. Readers
then discover the most fundamental concepts and abstractions of Spark,
particularly Resilient Distributed Datasets (RDDs) and the basic data
transformations that RDDs provide. The first part of the book covers
writing Spark applications using the core APIs. Readers also learn how to
work with structured data using Spark SQL, how to process near-real-time
data with Spark Streaming, how to apply machine learning algorithms with
Spark MLlib, and how to apply graph algorithms on graph-shaped data using
Spark GraphX, and they receive an introduction to Spark clustering.

Key Features:

• Clear introduction to Spark

• Teaches how to ingest near-real-time data

• Shows how to gain value from big data

• Includes real-life case studies

AUDIENCE

Readers should be familiar with Java, Scala, or Python. No knowledge of
Spark or streaming operations is assumed, but some acquaintance with
machine learning is helpful.

ABOUT THE TECHNOLOGY

Apache Spark is a big data processing framework perfect for analyzing
near-real-time streams and discovering historical patterns in batched data
sets. Spark also offers machine learning and graph processing capabilities.

Spark in Action

    £37.99

    Includes FREE delivery

    RRP £39.99 – you save £2.00 (5%)

    A Paperback / softback by Petar Zecevic

    Out of stock

      Publisher: Manning Publications
      Publication Date: 24/11/2016
      ISBN13: 9781617292606, 978-1617292606
      ISBN10: 1617292605

