Microsoft Azure Databricks Administration - ETL Workflow

Created by Shantanu Das


Prepare for the Azure Databricks Certified Associate Platform Administrator exam by solving practice questions, and learn the core assets.

What you'll learn
Learn Azure Databricks fundamentals and the components of Databricks: notebooks, clusters, pools, cluster policies, the Databricks CLI, and secret management.
Enable logging in Databricks using Azure Log Analytics workspace libraries, deploy the JAR, and query your Spark application's logs using Kusto Query Language.
Integrate Databricks notebooks with Git providers such as GitHub.
Automate administration of Azure Databricks and its resources via Terraform for multiple environments.
Learn how to transform small CSV datasets using Scala and SQL, and push the transformed data into Azure Blob Storage and Databricks tables.
Prepare for the Azure Databricks Certified Associate Platform Administrator exam by solving practice questions.
Configure continuous integration and delivery of your Spark application using Azure DevOps and DataThirst templates.
Run notebooks on Azure Databricks via Jobs.
Manage your Databricks cluster using Databricks CLI.

Requirements
An Azure trial subscription.
An IDE installed; IntelliJ or Visual Studio Code preferred.
No prior knowledge of data platforms is required.

Description
In this course, you will learn about Spark-based Azure Databricks. More and more data is generated every day from any source, be it CSV, text, JSON, or any other format, and consuming that data has become easy via IoT systems, mobile phones, the internet, and many other devices.

Azure Databricks provides the latest versions of Apache Spark and allows you to seamlessly integrate with open source libraries. Spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure.

Here is the 30,000 ft. overview of the course agenda: what you will learn and how you can apply it in a real-world data engineering scenario. This course is tailor-made for someone with no prior knowledge of Databricks and Azure who would like to start a career in the data world, especially around administering Azure Databricks.

Prepare for the Azure Databricks Certified Associate Platform Administrator exam by solving practice questions. Prepare for interviews and certification by solving quizzes at the end of sessions.

1. What is Databricks?
2. Databricks components:
   a. Notebook
   b. Clusters
   c. Pool
   d. Secrets
   e. Databricks CLI
   f. Cluster Policy
3. Automate the entire administration activity via Terraform.
4. Mount Azure Blob Storage with Databricks.
5. Automate mounting Azure Blob Storage with Databricks.
6. Load CSV data into Azure Blob Storage.
7. Transform data using Scala and SQL queries.
8. Load the transformed data into Azure Blob Storage.
9. Understand Databricks tables and the file system.
10. Configure Azure Databricks logging via Log4j and the Spark listener library with a Log Analytics workspace.
11. Configure CI/CD using Azure DevOps.
12. Git provider integration.
13. Configure notebook deployment via Databricks Jobs.

Overview
Section 1: Getting Started with Azure Databricks

Lecture 1 Azure Databricks Architecture

Lecture 2 Introduction to the Course

Lecture 3 Introduction to Azure Databricks Workspace.

Lecture 4 Databricks Clusters

Lecture 5 Databricks Pools

Lecture 6 Databricks Notebooks and magic commands

Lecture 7 Databricks CLI and DBFS management

Lecture 8 Administering Clusters via Terraform

Section 2: Cluster Security & Secret Management

Lecture 9 Mount Azure blob storage to load and save data

Lecture 10 Apply try and catch in a Scala application

Lecture 11 Automate blob mount via Terraform

Lecture 12 Transform data using Scala

Lecture 13 Apply filters using SQL

Lecture 14 Push transformed data into Blob storage

Lecture 15 NYC FireStation Data Schema

Lecture 16 Databricks tables and file system.

Lecture 17 aggregate, where, filter, orderBy, count functions

Lecture 18 Use of min(), max(), withColumnRenamed() functions

Lecture 19 Structured Streaming

Section 3: Secret Management

Lecture 20 Secret Management Agenda

Lecture 21 Managing secrets via key vault

Lecture 22 Automate secret management using Terraform

Section 4: Configure Logging using spark log analytics library

Lecture 23 Analyse logs using Log4j

Lecture 24 Configure Azure workspace log analytics

Lecture 25 Use listeners and log analytics JARs to configure logging.

Lecture 26 Apply the init script on clusters and stream logs

Lecture 27 Automate workspace analytics and logging via Terraform

Section 5: Databricks Notebook CI CD via Azure DevOps

Lecture 28 Integrate Databricks notebooks with Git providers like GitHub.

Lecture 29 Configure Continuous Integration - artefacts to be deployed to clusters.

Lecture 30 Configure Continuous Delivery using DataThirst templates.

Lecture 31 Run notebook on Azure Databricks via Jobs.

Lecture 32 Secure cluster via cluster policy and permission

Lecture 33 Data Factory linked services

Lecture 34 Orchestrate notebooks via Data Factory

Data Engineers, Infrastructure Engineers, Databricks Engineers, DataOps


      
Course Contents
1 - Getting Started with Azure Databricks
2 - Cluster Security & Secret Management
3 - Secret Management
4 - Configure Logging using spark log analytics library
5 - Databricks Notebook CI CD via Azure DevOps
