Reading: tidy-data.pdf
The goal of reading this paper is
chp5-handout_r1.pdf
Chapter 5, Foundations for Inference
https://docs.google.com/spreadsheets/d/1Le_A8n8LUNnPUQvdawnMw1ULmdLiuLaV-XPABvZFIGg/edit?usp=sharing
631-wed-midterm-hist.png
readr:
library(readr)
midterm <- read_csv("631_f19_public - Sheet1_r1.csv")
base-r: (no library needed)
midterm2 <- read.csv("631_f19_public - Sheet1_r1.csv")
# TRUE for rows where sequence is not missing
!is.na(midterm$sequence)
# total scores for those rows only
midterm$total[!is.na(midterm$sequence)]
hist(midterm$total[!is.na(midterm$sequence)], main = "Histogram of 631, Wednesday", xlab = "score including ec")
mean(midterm$total[!is.na(midterm$sequence)])
sd(midterm$total[!is.na(midterm$sequence)])
median(midterm$total[!is.na(midterm$sequence)])
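The subsetting pattern above repeats the same `!is.na(...)` filter on every line. A minimal sketch of the same idea with the filter stored once, using a small made-up data frame (`midterm_toy` and its values are invented for illustration and are not the real midterm scores):

```r
# Toy data, made up for illustration; not the real midterm scores.
midterm_toy <- data.frame(
  sequence = c(1, 2, NA, 4, NA),
  total    = c(80, 90, 55, 70, 60)
)

# Logical vector: TRUE where sequence is present, FALSE where it is NA
has_sequence <- !is.na(midterm_toy$sequence)

# Keep the totals for the rows that have a sequence value
scores <- midterm_toy$total[has_sequence]

mean(scores)    # 80
sd(scores)      # 10
median(scores)  # 80
```

Storing the filtered vector in one name (here the hypothetical `scores`) means the filter only has to be changed in one place.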
dir, basename
ldply
plyr vs base-r apply
install.packages("tidyverse")
if we get to it: plyr
https://seananderson.ca/courses/12-plyr/plyr_2012.pdf https://stackoverflow.com/questions/tagged/plyr
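To make the dir, basename, and "plyr vs base-r apply" items above concrete, here is a rough sketch in base R. The folder, the file names, and the helper `read_one` are all hypothetical, created only so the example runs on its own.

```r
# Hypothetical setup: write two small CSV files into a temporary folder.
tmp <- file.path(tempdir(), "dir-basename-demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(score = c(1, 2)), file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(score = c(3, 4, 5)), file.path(tmp, "b.csv"), row.names = FALSE)

# dir() lists the files; full.names = TRUE keeps the folder in each path
paths <- dir(tmp, pattern = "\\.csv$", full.names = TRUE)

# basename() drops the folder and leaves just the file name
basename(paths)

# Read one file and record which file each row came from
read_one <- function(path) {
  d <- read.csv(path)
  d$file <- basename(path)
  d
}

# Base R: lapply() returns a list of data frames, rbind stacks them
combined <- do.call(rbind, lapply(paths, read_one))
combined

# If plyr is installed, plyr::ldply(paths, read_one) does the
# apply-and-stack step in a single call.
```

The base-R version needs two steps (apply, then combine); `ldply` folds both into one, which is the main convenience being compared.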
comments
From Jessie Zheng:
Really like the paper about tidy data. I was wondering about the difference between data cleaning and data tidying. Here is some useful information about it, in case anyone would like to know :)
https://www.idashboards.com/blog/2018/11/14/data-cleaning-vs-data-tidying/
https://www.measureevaluation.org/resources/newsroom/blogs/tidy-data-and-how-to-get-it
From Mohamed Hasan:
Here is the GitHub link for the paper: https://github.com/hadley/tidy-data
There is also this video (https://vimeo.com/33727555), where the author explains tidy data well.
His other video (https://www.youtube.com/watch?v=TaxJwC_MP9Q) covers tidy tools like ggplot, plyr, reshape, etc.