Spark and Big Data Processing with R and Python: Building Scalable Pipelines with Sparklyr, PySpark, and Databricks
Overview
Reactive Publishing
Learn big data processing with Spark using R and Python.
This book offers a practical introduction to Apache Spark for large-scale data work. It focuses on building scalable data pipelines using Sparklyr, PySpark, and Databricks.
You will discover how to:
- Work with large datasets using Apache Spark
- Use Sparklyr to integrate Spark with R
- Develop with PySpark in Python
- Leverage Databricks for cloud-based Spark environments
- Construct reliable data pipelines for real-world use
The book bridges the R and Python ecosystems, helping data professionals use the right language for different Spark tasks. It covers core concepts including data transformation, distributed processing, and moving projects from development to production.
It is written for data analysts, data scientists, and engineers who want to add Spark to their toolkit.
Perfect for: Professionals looking to expand their skills in big data processing with Spark across both R and Python.
This item is Non-Returnable
Details
- ISBN-13: 9798198822771
- Publisher: Independently Published
- Publish Date: May 2026
- Dimensions: 9 x 6 x 1.05 inches
- Shipping Weight: 1.12 pounds
- Page Count: 424
