Introduction to Hive Course
This course teaches the Hive query language and how to apply it to common Big Data problems. It includes an introduction to distributed computing, Hadoop, and MapReduce fundamentals, and covers the latest features released with Hive 0.11. From developer to analyst, this course tackles a few big questions about big data: Why does this technology exist, and why do I need it? How can I get the best out of it using something familiar like SQL, and how does it all fit together in an ever-evolving eco-system? After introducing the concepts of distributed computing, Hadoop, and MapReduce, the course goes into detail on Apache Hive, which provides an SQL-like query language that can be used with Hadoop and with NoSQL databases such as HBase and Cassandra. It also presents some of the challenges you might face when solving real production problems and shows how Hive makes those tasks easier.
Lesson Description
Section 1 Introduction to Hadoop
Lesson 01 Introduction
Lesson 02 Motivation for Hadoop
Lesson 03 Distributed Computing Challenges
Lesson 04 Hadoop File System (HDFS)
Lesson 05 MapReduce
Lesson 06 Word Count Example
Lesson 07 Demo - Basic Hadoop Commands and Environment Setup
Lesson 08 Summary
Section 2 Introduction to Hive
Lesson 09 Introduction
Lesson 10 Hive Motivation
Lesson 11 Hive Architecture
Lesson 12 Hive Principles - Schema on Read
Lesson 13 Hive Principles - The Hive Warehouse
Lesson 14 Hive Query Language Basics - Select and Sub Queries
Lesson 15 Creating Databases and Tables with HiveQL
Lesson 16 Demo - Working with Hive Tables and Loading Data into Warehouse
Lesson 17 Loading Data - Hive Managed and External Tables
Lesson 18 Demo - External Tables and Create Table Alternatives
Lesson 19 Summary
Section 3 Hive Query Language
Lesson 20 Introduction
Lesson 21 Data types
Lesson 22 Type Conversions
Lesson 23 Managed Partition Tables
Lesson 24 External Partitioned Tables
Lesson 25 Demo - Table Partitioning
Lesson 26 Multi Inserts and Dynamic Partition Inserts
Lesson 27 Demo - Loading Data Use Case
Lesson 28 Data Retrieval Group By and Functions
Lesson 29 Sorting and Controlling Data Flow
Lesson 30 The CLI and Variable Substitution
Lesson 31 Summary
Section 4 Advanced HiveQL
Lesson 32 Introduction
Lesson 33 Bucketing
Lesson 34 Bucket and Blocked Sampling
Lesson 35 Joins
Lesson 36 Joins in Depth and Join Optimizations
Lesson 37 Map side Joins for Bucketed Tables
Lesson 38 Distributed Cache
Lesson 39 UDTFs Explode and Lateral View
Lesson 40 Demo - Extending Hive - Creating Your own UDF
Lesson 41 Demo - Extending Hive - Compiling and Testing Custom
Lesson 42 Extending Hive Custom UDF Recap
Lesson 43 Demo - Hive Initialization File
Lesson 44 Accessing The Distributed Cache
Lesson 45 Hadoop Streaming and Transform()
Lesson 46 Windowing and Analytics Functions
Lesson 47 Demo - Putting it All Together Using Transform
Lesson 48 Demo - Analytics Functions
Lesson 49 Demo - Ranking Functions
Lesson 50 Summary
Section 5 Storage and the Eco-System
Lesson 51 Create Table Statement - File Formats and SerDes
Lesson 52 HCatalog
Lesson 53 Sqoop
Lesson 54 DistCP
Lesson 55 Hadoop Eco-System Projects
Lesson 56 References and Resources
Lesson 57 Summary