Extending Hadoop for Data Science: Streaming, Spark, Storm, and Kafka

with Lynn Langit


Extend your Hadoop data science knowledge by learning how to use other Apache data science platforms, libraries, and tools. This course goes beyond the basics of Hadoop MapReduce and into other key Apache libraries that bring flexibility to your Hadoop clusters. It covers core Spark, SparkSQL, SparkR, and SparkML. Learn how to scale and visualize your data with interactive Databricks clusters and notebooks and other implementations. The course is designed to help those working in data science, development, or analytics get familiar with these related technologies.

Topics include:
  • Relate which file system is typically used with Hadoop.
  • Explain the differences between Apache and commercial Hadoop distributions.
  • Cite how to set up an IDE (VS Code with the Python extension).
  • Relate the value of Databricks Community Edition.
  • Compare YARN vs. Standalone.
  • Review various streaming options.
  • Recall how to select your programming language.
  • Describe the Databricks environment.

      
Course Contents
01. Introduction
02. Hadoop Core Fundamentals
03. Setting Up a Hadoop Dev Environment
04. Hadoop Batch Processing
05. Fast Hadoop Options
06. Spark Basics
07. Using Spark
08. Spark Libraries
09. Spark Streaming
10. Hadoop Streaming
11. Modern Hadoop Architectures
12. Conclusion
Exercise Files
