Description

Book Synopsis

When it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task. Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects.

Data Analysis with Python and PySpark is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear and hands-on guide shows you how to enlarge your processing capabilities across multiple machines with data from any source, ranging from Hadoop-based clusters to Excel worksheets. You'll learn how to break down big analysis tasks into manageable chunks and how to choose and use the best PySpark data abstraction for your unique needs.

The Spark data processing engine is an amazing analytics factory: raw data comes in, and insight comes out. Thanks to its ability to handle massive amounts of data distributed across a cluster, Spark has been adopted as standard by organizations both big and small. PySpark, which wraps the core Spark engine with a Python-based API, puts Spark-based data pipelines in the hands of programmers and data scientists working with the Python programming language. PySpark simplifies Spark's steep learning curve and provides a seamless bridge between Spark and an ecosystem of Python-based data science tools.



Trade Review

“A great and gentle introduction to spark.” Javier Collado Cabeza

“A phenomenal introduction to PySpark from the ground up.” Anonymous Reviewer

“A great book to get you started with PySpark!” Jeremy Loscheider

“Takes you on an example focused tour of building pyspark data structures from the data you provide and processing them at speed.” Alex Lucas

“If you need to learn PySpark (as a Data Scientist or Data Wrangler) start with this book!” Geoff Clark

Data Analysis with Python and PySpark

    £40.85

    Includes FREE delivery

    RRP £45.39 – you save £4.54 (10%)

    A Paperback / softback by Jonathan Rioux

    7 in stock

      Publisher: Manning Publications
      Publication Date: 16/03/2022
      ISBN13: 9781617297205, 978-1617297205
      ISBN10: 1617297208
