Reading: tidy-data.pdf

The goal of reading this paper is

Lecture handout:

chp5-handout_r1.pdf

Textbook:

Chapter 5, Foundations for Inference

Lecture slides (w/ answers): chp5_r1.pdf

Midterm Results:

https://docs.google.com/spreadsheets/d/1Le_A8n8LUNnPUQvdawnMw1ULmdLiuLaV-XPABvZFIGg/edit?usp=sharing

631-wed-midterm-hist.png

R Topics:

loading CSV data from the midterm

readr:

library(readr)
midterm <- read_csv("631_f19_public - Sheet1_r1.csv")

base-r: (no library needed)

midterm2 <- read.csv("631_f19_public - Sheet1_r1.csv")
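# the lines below use midterm (the readr version); midterm2 has the same columns
# TRUE for rows where sequence is not missing
!is.na(midterm$sequence)
# total scores for those rows only
midterm$total[!is.na(midterm$sequence)]
hist(midterm$total[!is.na(midterm$sequence)], main="Histogram of 631, Wednesday", xlab="score including ec")
# summary statistics for the same subset
mean(midterm$total[!is.na(midterm$sequence)])
sd(midterm$total[!is.na(midterm$sequence)])
median(midterm$total[!is.na(midterm$sequence)])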
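
The same filter-then-summarize pattern can be tried on a tiny table. This is only a sketch: the data frame toy and its values are made up for illustration and are not the real midterm scores. Storing the logical filter once in keep avoids repeating the !is.na(...) expression on every line.

```r
# Made-up scores, NOT the real midterm data.
# An NA in sequence marks a row that should be left out.
toy <- data.frame(sequence = c(1, NA, 3, 4),
                  total    = c(80, 55, 90, 70))

keep   <- !is.na(toy$sequence)  # TRUE FALSE TRUE TRUE
scores <- toy$total[keep]       # 80 90 70

mean(scores)    # 80
median(scores)  # 80
sd(scores)      # 10
```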

from the paper:

dir, basename, ldply

plyr vs base-r apply
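
A minimal sketch of how dir, basename, and ldply fit together, next to a base-R version built on lapply. The folder, file names, and values are hypothetical and created only for this demo. The plyr call runs only if plyr is installed.

```r
# Hypothetical setup: write two small CSV files to a temporary folder.
tmp <- file.path(tempdir(), "tidy-demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, score = c(10, 12)),
          file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, score = c(9, 15)),
          file.path(tmp, "b.csv"), row.names = FALSE)

# dir() lists the matching files; basename() drops the folder part of each path.
paths <- dir(tmp, pattern = "\\.csv$", full.names = TRUE)
names(paths) <- basename(paths)

# base R: read each file, tag its rows with the file name, then stack the pieces.
pieces <- lapply(names(paths), function(f) {
  d <- read.csv(paths[[f]], stringsAsFactors = FALSE)
  data.frame(file = f, d, stringsAsFactors = FALSE)
})
combined <- do.call(rbind, pieces)

# plyr: ldply() applies read.csv to each path and row-binds the results;
# the names of paths end up in a .id column.
if (requireNamespace("plyr", quietly = TRUE)) {
  combined_plyr <- plyr::ldply(paths, read.csv, stringsAsFactors = FALSE)
}
```

Both versions give one data frame with four rows and a column recording which file each row came from. The base-R version spells out the split, apply, and combine steps that ldply does in a single call.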

install.packages("tidyverse")

if we get to it: plyr https://seananderson.ca/courses/12-plyr/plyr_2012.pdf https://stackoverflow.com/questions/tagged/plyr

comments

from Jessie Zheng:

I really like the paper about tidy data. I was wondering about the differences between data cleaning and data tidying. Here is some useful information about it, in case anyone would like to know :)

https://www.idashboards.com/blog/2018/11/14/data-cleaning-vs-data-tidying/

https://www.measureevaluation.org/resources/newsroom/blogs/tidy-data-and-how-to-get-it

From Mohamed Hasan:

Here is the GitHub link for the paper: https://github.com/hadley/tidy-data

There is also this video (https://vimeo.com/33727555) where the author explains tidy data well.

Also his other video (https://www.youtube.com/watch?v=TaxJwC_MP9Q) about tidy tools like ggplot, plyr, reshape, etc.