Learn Apache Spark : Build Scalable Pipelines with PySpark and Optimization

Name: Learn Apache Spark
SKU: 9798289704603
Price: 15.90 USD
Availability: InStock

by Studiod21 Smart Tech Content and Diego Rodrigues

Overview

LEARN APACHE SPARK: Build Scalable Pipelines with PySpark and Optimization

This book is designed for students, developers, data engineers, data scientists, and technology professionals who want to master Apache Spark in practice across corporate environments, public cloud platforms, and modern integrations.

You will learn to build scalable pipelines for large-scale data processing, orchestrating distributed workloads with AWS EMR, Databricks, Azure Synapse, and Google Cloud Dataproc. The content covers integration with Hadoop, Hive, Kafka, SQL, Delta Lake, MongoDB, and Python, as well as advanced techniques in tuning, job optimization, real-time analysis, machine learning with MLlib, and workflow automation.

Includes:

- Implementation of ETL and ELT pipelines with Spark SQL and DataFrames

- Streaming data processing and integration with Kafka and AWS Kinesis

- Optimization of distributed jobs, performance tuning, and use of Spark UI

- Integration of Spark with S3, Data Lake, NoSQL, and relational databases

- Deployment on managed clusters in AWS, Azure, and Google Cloud

- Applied Machine Learning with MLlib, Delta Lake, and Databricks

- Automation of routines, monitoring, and scalability for Big Data

By the end, you will master Apache Spark as a professional solution for data analysis, process automation, and machine learning in complex, high-performance environments.

Content reviewed by A.I. with technical supervision.

apache spark, big data, pipelines, distributed processing, aws emr, databricks, streaming, etl, machine learning, cloud integration, Google Data Engineer, AWS Data Analytics, Azure Data Engineer, Big Data Engineer, MLOps, DataOps Professional

This item is Non-Returnable

Details

ISBN-13: 9798289704603
Publisher: Independently Published
Publish Date: June 2025
Dimensions: 9 x 6 x 0.54 inches
Shipping Weight: 0.77 pounds
Page Count: 258
