Tidy data is a standardized way of organizing data values within a dataset. By applying tidy data principles, statisticians, analysts, and data scientists can spend less time cleaning data and more time on the more compelling aspects of data analysis. In this course, learn the principles of tidy data and discover how to create and manipulate tibbles, transforming source data into tidy formats. Instructor Mike Chapple uses the R programming language and the tidyverse packages to teach data wrangling: the data cleaning and data transformation tasks that consume a substantial portion of analysts' time. He wraps up with three hands-on case studies that reinforce the data wrangling principles and tactics covered in this course.
00. Introduction
001 Welcome
002 What you need to know
003 Using the exercise files
01. Tidy Data
004 What is tidy data?
005 Variables, observations, and values
006 Common data problems
007 Using the tidyverse
02. Working with tibbles
008 Building and printing tibbles
009 Subsetting tibbles
010 Filtering tibbles
03. Importing Data into R
011 What are CSV files?
012 Importing CSV files into R
013 What are TSV files?
014 Importing TSV files into R
015 Importing delimited files into R
016 Importing fixed-width files into R
017 Importing Excel files into R
018 Reading data from databases and the web
04. Data Transformation
019 Wide vs. long datasets
020 Making wide datasets long with gather()
021 Making long datasets wide with spread()
022 Converting data types in R
023 Working with dates and times in R
05. Data Cleaning
024 Detecting outliers
025 Missing and special values in R
026 Breaking apart columns with separate()
027 Combining columns with unite()
028 Manipulating strings in R with stringr
06. Data Wrangling Case Study: Coal Consumption
029 Understanding the coal dataset
030 Reading in the coal dataset
031 Converting the coal dataset from long to wide
032 Segmenting the coal dataset
033 Visualizing the coal dataset
07. Data Wrangling Case Study: Water Quality
034 Understanding the water quality dataset
035 Reading in the water quality dataset
036 Filtering the water quality dataset
037 Water quality data types
038 Correcting data entry errors
039 Identifying and removing outliers
040 Converting temperature from Fahrenheit to Celsius
041 Widening the water quality dataset
08. Data Wrangling Case Study: Social Security Disability Claims
042 Understanding the Social Security Disability dataset
043 Importing the Social Security Disability dataset
044 Making the Social Security Disability dataset long
045 Formatting dates in the Social Security Disability dataset
046 Handling fiscal years in the Social Security Disability dataset
047 Widening the Social Security Disability dataset
048 Visualizing the Social Security Disability dataset
049 Next steps