Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
Posted by Superadmin on November 15 2020 15:53:54

Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
with Lynn Langit

01_01-Welcome

Extend your Hadoop data science knowledge by learning how to use other Apache data science platforms, libraries, and tools. This course goes beyond the basics of Hadoop MapReduce and into other key Apache libraries that bring flexibility to your Hadoop clusters. It covers core Spark, SparkSQL, SparkR, and SparkML. Learn how to scale and visualize your data with interactive Databricks clusters and notebooks, as well as other implementations. This course is designed to help those working in data science, development, or analytics get familiar with the attendant technologies.

Topics Include:

Relate which file system is typically used with Hadoop.
Explain the differences between Apache and commercial Hadoop distributions.
Cite how to set up an IDE (VS Code with the Python extension).
Relate the value of Databricks Community Edition.
Compare YARN vs. Standalone.
Review various streaming options.
Recall how to select your programming language.
Describe the Databricks environment.

Course Contents

01. Introduction

02. Hadoop Core Fundamentals

03. Setting Up a Hadoop Dev Environment

04. Hadoop Batch Processing

05. Fast Hadoop Options

06. Spark Basics

07. Using Spark

08. Spark Libraries

09. Spark Streaming

10. Hadoop Streaming

11. Modern Hadoop Architectures

12. Conclusion

12_01-Next steps

Exercise Files

Ex_Files_Extending_Hadoop

01_02-What you should know

01_03-Using the exercise files

02_01-Modern Hadoop

02_02-File system used with Hadoop

02_03-Apache and commercial Hadoop distributions

02_04-Hadoop libraries

02_05-Hadoop on Google Cloud Platform

02_06-Run Hadoop job on GCP

02_07-Databricks on AWS

03_01-Set up IDE VS Code Python extension

03_02-Sign up for Databricks Community Edition

03_03-Add Hadoop libraries to your test environment

03_04-Your first cluster on Databricks Community Edition

03_05-Load data into tables

04_01-Processing options

04_02-Prerequisite understanding

04_03-Resource coordinators

04_04-Compare YARN vs. Standalone

05_01-Fast Hadoop use cases

05_02-Big data streaming

05_03-Streaming options

05_04-Apache Spark basics

05_05-Spark use cases

06_01-Apache Spark libraries

06_02-Spark data interfaces

06_03-Select your programming language

06_04-Spark session objects

06_05-Spark shell

07_01-Tour the Databricks Environment

07_02-Tour the notebook

07_03-Import and export notebooks

07_04-Calculate pi on Spark

07_05-Run wordcount on Spark with Scala

07_06-Understand wordcount on Spark with Python

07_07-Import data

07_08-Transformations and actions

07_09-Caching and the DAG

08_01-Spark SQL

08_02-SparkR

08_03-Spark ML_ Preparing data

08_04-Spark ML_ Building the model

08_05-Spark ML_ Evaluating the model

08_06-Advanced machine learning on Spark

08_07-MXNet or TensorFlow

08_08-Spark with GraphX

08_09-Spark with ADAM for genomics

09_01-Reexamine streaming pipelines

09_02-Spark streaming

09_03-Streaming ingest services

09_04-Advanced Spark streaming with MLeap

10_01-PubSub on GCP

10_02-Apache Kafka

10_03-Kafka architecture

10_04-Apache Storm

10_05-Storm architecture

11_01-Combine Hadoop libraries and more

11_02-Review batch architecture for ETL

11_03-Spark architecture for interactive analytics

11_04-Spark architecture for genomics

11_05-Spark Streaming architecture for IoT

11_06-Spark Streaming architecture for dynamic prediction

12_01-Next steps

Ex_Files_Extending_Hadoop

Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
with Lynn Langit

01_01-Welcome

01_02-What you should know

01_03-Using the exercise files

02_01-Modern Hadoop

02_02-File system used with Hadoop

02_03-Apache and commercial Hadoop distributions

02_04-Hadoop libraries

02_05-Hadoop on Google Cloud Platform

02_06-Run Hadoop job on GCP

02_07-Databricks on AWS

03_01-Set up IDE VS Code Python extension

03_02-Sign up for Databricks community edition

03_03-Add Hadoop libraries to your test environment

03_04-Your first cluster on Databricks Community Edition

03_05-Load data into tables

04_01-Processing options

04_02-Prerequisite understanding

04_03-Resource coordinators

04_04-Compare YARN vs. Standalone

05_01-Fast Hadoop use cases

05_02-Big data streaming

05_03-Streaming options

05_04-Apache Spark basics

05_05-Spark use cases

06_01-Apache Spark libraries

06_02-Spark data interfaces

06_03-Select your programming language

06_04-Spark session objects

06_05-Spark shell

07_01-Tour the Databricks Environment

07_02-Tour the notebook

07_03-Import and export notebooks

07_04-Calculate pi on Spark

07_05-Run wordcount of Spark with Scala

07_06-Understand wordcount on Spark with Python

07_07-Import data

07_08-Transformations and actions

07_09-Caching and the DAG

08_01-Spark SQL

08_02-SparkR

Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka
with Lynn Langit