Data Wrangling in R
Posted by Superadmin on February 09 2019 05:07:09

Tidy data is a data format that provides a standardized way of organizing data values within a dataset. By leveraging tidy data principles, statisticians, analysts, and data scientists can spend less time cleaning data and more time tackling the more compelling aspects of data analysis. In this course, learn about the principles of tidy data, and discover how to create and manipulate data tibbles—transforming them from source data into tidy formats. Instructor Mike Chapple uses the R programming language and the tidyverse packages to teach the concept of data wrangling—the data cleaning and data transformation tasks that consume a substantial portion of analysts' time. He wraps up with three hands-on case studies that help to reinforce the data wrangling principles and tactics covered in this course.

Topics include:

What's tidy data?
Using the tidyverse
Working with tibbles
Subsetting and filtering tibbles
Importing data into R
Making wide datasets long with gather()
Making long datasets wide with spread()
Converting data types in R
Detecting outliers
Manipulating strings in R with stringr
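
As a rough illustration of the gather() and spread() topics listed above, here is a minimal sketch, assuming the tibble and tidyr packages are installed. The dataset, its column names, and its values are hypothetical and do not come from the course's exercise files.

```r
library(tibble)
library(tidyr)

# Hypothetical wide dataset: one row per country, one column per year
wide <- tibble(
  country = c("A", "B"),
  `2017` = c(10, 20),
  `2018` = c(11, 21)
)

# Wide to long: the year columns collapse into key/value pairs,
# while country is excluded from gathering and kept as an identifier
long <- gather(wide, key = "year", value = "value", -country)

# Long back to wide: each distinct year becomes its own column again
wide_again <- spread(long, key = year, value = value)

print(long)
print(wide_again)
```

Recent tidyr releases mark gather() and spread() as superseded in favor of pivot_longer() and pivot_wider(); the older functions still work and are used here only because they are the ones named in the topic list.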

00. Introduction

001 Welcome
002 What you need to know
003 Using the exercise files

01. Tidy Data

004 What is tidy data?
005 Variables, observations, and values
006 Common data problems
007 Using the tidyverse

02. Working with Tibbles

008 Building and printing tibbles
009 Subsetting tibbles
010 Filtering tibbles

03. Importing Data into R

011 What are CSV files?
012 Importing CSV files into R
013 What are TSV files?
014 Importing TSV files into R
015 Importing delimited files into R
016 Importing fixed-width files into R
017 Importing Excel files into R
018 Reading data from databases and the web

04. Data Transformation

019 Wide vs. long datasets
020 Making wide datasets long with gather()
021 Making long datasets wide with spread()
022 Converting data types in R
023 Working with dates and times in R

05. Data Cleaning

024 Detecting outliers
025 Missing and special values in R
026 Breaking apart columns with separate()
027 Combining columns with unite()
028 Manipulating strings in R with stringr

06. Data Wrangling Case Study: Coal Consumption

029 Understanding the coal dataset
030 Reading in the coal dataset
031 Converting the coal dataset from long to wide
032 Segmenting the coal dataset
033 Visualizing the coal dataset

07. Data Wrangling Case Study: Water Quality

034 Understanding the water quality dataset
035 Reading in the water quality dataset
036 Filtering the water quality dataset
037 Water quality data types
038 Correcting data entry errors
039 Identifying and removing outliers
040 Converting temperature from Fahrenheit to Celsius
041 Widening the water quality dataset

08. Data Wrangling Case Study: Social Security Disability Claims

042 Understanding the Social Security Disability dataset
043 Importing the Social Security Disability dataset
044 Making the Social Security Disability dataset long
045 Formatting dates in the Social Security Disability dataset
046 Handling fiscal years in the Social Security Disability dataset
047 Widening the Social Security Disability dataset
048 Visualizing the Social Security Disability dataset
049 Next steps