Key info |
Offered by | Udacity |
Description | Learn how to process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. You’ll start by understanding the components of data streaming systems. You’ll then build a real-time analytics application. You’ll also compile data, run analytics, and draw insights from reports generated by the streaming console.
The goal of this course is to grow your expertise in the components of streaming data systems and to build a real-time analytics application. Specifically, you will be able to identify the components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.
Learning objectives
- Process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.
- Understand the components of data streaming systems.
- Build a real-time analytics application.
- Compile data and run analytics.
- Draw insights from reports generated by the streaming console.
Software: Python
|
Accredited by | Udacity |
URL |
https://www.udacity.com/course/data-streaming-nanodegree--nd029
|