Patrick Cahan
List the contents of your current working directory
ls -laht
Make a new directory, or folder, and 'go there':
mkdir ~/bootcamp2016/
cd ~/bootcamp2016
mkdir day1
cd day1
Now launch R
R
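Once R is running, the same navigation can be done from inside R. The sketch below is only an illustration: it works in a scratch folder under tempdir() rather than ~/bootcamp2016, and the name demo_dir is made up for this example.

```r
# Build a scratch path; tempdir() is used so nothing in your home directory changes
demo_dir <- file.path(tempdir(), "bootcamp_demo", "day1")

# Rough equivalent of `mkdir -p`: recursive = TRUE creates parent folders as needed
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)

# Rough equivalent of `cd`: setwd() returns the previous directory, so keep it
old_dir <- setwd(demo_dir)

# Rough equivalents of `pwd` and `ls -a`
print(getwd())
print(list.files(all.files = TRUE))

# Go back to where we started
setwd(old_dir)
```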
In these slides, a command (we'll also call it code or a code chunk) looks like this:
print("I'm code")
## [1] "I'm code"
The output of the code appears directly after it. So print("I'm code") is the code chunk and [1] "I'm code" is the output.
2 + (2 * 3)^2
## [1] 38
(1 + 3) / 2 + 45
## [1] 47
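The two expressions above depend on parentheses to control the order of evaluation. A short sketch of how R orders operators when the parentheses are left out (the variable names are only for illustration):

```r
# Exponentiation binds tighter than multiplication
no_parens   <- 2 * 3^2      # 3^2 is evaluated first
with_parens <- (2 * 3)^2    # parentheses force the product first
print(c(no_parens, with_parens))

# Exponentiation also binds tighter than a leading minus sign
neg_square <- -2^2          # parsed as -(2^2)
square_neg <- (-2)^2
print(c(neg_square, square_neg))

# Integer division and remainder
quotient  <- 7 %/% 2
remainder <- 7 %% 2
print(c(quotient, remainder))
```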
1:10
## [1] 1 2 3 4 5 6 7 8 9 10
1:10 produces the same values as this call to the seq function:
seq(from=1,to=10,by=1)
## [1] 1 2 3 4 5 6 7 8 9 10
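The from, to, and by arguments shown above can be varied, and seq also accepts length.out when you care about the number of points rather than the step. A small sketch (the variable names are only for illustration):

```r
# A different step size
odds <- seq(from = 1, to = 10, by = 2)
print(odds)

# Ask for a fixed number of evenly spaced points instead of a step size
quarters <- seq(from = 0, to = 1, length.out = 5)
print(quarters)

# Count down by using a negative step
countdown <- seq(from = 10, to = 1, by = -3)
print(countdown)

# The colon operator and seq() give the same values
same_values <- all(1:10 == seq(from = 1, to = 10, by = 1))
print(same_values)
```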
To learn more about a function, open its help page with ?
?seq
Most of the time you want to capture the result of a computation. To do that, assign it to a variable:
x <- rnorm(1e4, mean=0, sd=2) # result does not get sent to output
R's listing function, ls:
ls()
## [1] "x"
Once a result is stored in a variable, you can pass it to other functions:
mean(x)
## [1] 0.03374283
hist(x)
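Because rnorm draws random numbers, the mean printed above will differ slightly on every run. A minimal sketch of making the draw reproducible with set.seed; the seed value 2016 is an arbitrary choice for this example, not something taken from the slides:

```r
set.seed(2016)                      # arbitrary seed, chosen only for this sketch
x <- rnorm(1e4, mean = 0, sd = 2)

x_mean <- mean(x)   # should be close to 0
x_sd   <- sd(x)     # should be close to 2
print(c(mean = x_mean, sd = x_sd))

# Re-using the same seed reproduces exactly the same draws
set.seed(2016)
x_again <- rnorm(1e4, mean = 0, sd = 2)
print(identical(x, x_again))
```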
data.frames are somewhat advanced objects in R
Here we introduce "1 dimensional" classes; these are often referred to as 'vectors'
Vectors can hold many observations, but every element has to be of the same class
class(x)
## [1] "numeric"
y <- "mistakenly, embryomics, smoogy boogy"
print(y)
## [1] "mistakenly, embryomics, smoogy boogy"
class(y)
## [1] "character"
The function c() collects/combines/joins single R objects into a vector of R objects. It is mostly used for creating vectors of numbers, character strings, and other data types.
x <- c(1, 4, 6, 8)
x
## [1] 1 4 6 8
class(x)
## [1] "numeric"
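Since every element of a vector must share one class, c() converts its inputs to a common class when they differ. A short sketch of that behaviour (the variable names are only for illustration):

```r
# Numbers, strings and logicals together: everything becomes character
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))

# Logical and numeric together: the logical is converted to a number
num_lgl <- c(1.5, TRUE)
print(num_lgl)
print(class(num_lgl))
```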
length(): Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.
length(x)
## [1] 4
y
## [1] "mistakenly, embryomics, smoogy boogy"
length(y)
## [1] 1
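The length of y is 1 because the commas sit inside a single character string; they do not separate vector elements. A sketch of two ways to get a three-element vector instead (the names words and words_c are only for illustration):

```r
y <- "mistakenly, embryomics, smoogy boogy"

# strsplit() returns a list with one element per input string;
# [[1]] extracts the character vector for our single string
words <- strsplit(y, ", ")[[1]]
print(words)
print(length(words))

# The same vector built directly with c()
words_c <- c("mistakenly", "embryomics", "smoogy boogy")
print(identical(words, words_c))
```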
x <- rnorm(1e4, mean=0, sd=2)
hist(x)
x <- x + 10
hist(x)
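The two histograms above have the same shape but different locations, because x + 10 adds 10 to every element of the vector. A minimal sketch that checks this numerically; the seed and the name shifted are assumptions made for this example:

```r
set.seed(1)   # arbitrary seed for this sketch
x <- rnorm(1e4, mean = 0, sd = 2)
shifted <- x + 10   # 10 is added to each of the 10,000 values

print(length(shifted))
print(mean(shifted) - mean(x))   # the mean moves by 10 (up to rounding)
print(sd(shifted) - sd(x))       # the spread is unchanged (up to rounding)
```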
Go to http://tryr.codeschool.com/ and earn your chapter badges!
Each row of the data set represents one character from the book series Game of Thrones
Load the data, adjusting the path as needed:
survdata<-read.csv("../../resources/character-deaths.csv", as.is=TRUE)
dim(survdata)
## [1] 917 13
colnames(survdata)
## [1] "Name" "Allegiances" "Death.Year"
## [4] "Book.of.Death" "Death.Chapter" "Book.Intro.Chapter"
## [7] "Gender" "Nobility" "GoT"
## [10] "CoK" "SoS" "FfC"
## [13] "DwD"
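Beyond dim and colnames, a few other base R functions help when getting to know a data frame. The sketch below uses a small made-up data frame named toy as a stand-in, since the CSV file may not be at hand; only the column names are borrowed from the real data, and the values are invented.

```r
# Hypothetical stand-in for survdata with a few of its columns; values are made up
toy <- data.frame(
  Name        = c("Character A", "Character B", "Character C", "Character D"),
  Allegiances = c("House X", "House Y", "House X", "None"),
  Death.Year  = c(299, NA, NA, 300),
  stringsAsFactors = FALSE
)

print(dim(toy))           # number of rows, then number of columns
str(toy)                  # class and first values of each column
print(head(toy, n = 2))   # first two rows

# A single column is pulled out with $ and is an ordinary vector
print(class(toy$Allegiances))
```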
Use ggplot2 to make nice graphics
Install the package if you don't already have it...
install.packages("ggplot2")
Load the library
library(ggplot2)
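How many characters per house? (Quoting the column name maps the literal string "Allegiances" rather than the column, so this draws a single bar.)
ggplot(survdata, aes(x="Allegiances")) + geom_bar() + theme_bw()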
How many characters per house? (Pass the column name unquoted so that each house gets its own bar.)
ggplot(survdata, aes(x=Allegiances)) + geom_bar() + theme_bw()
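The bar heights in the plot above are counts of rows per Allegiances value. The same numbers can be obtained without plotting by using table. The vector below is made up for illustration and is not taken from the real data:

```r
# Hypothetical allegiance values standing in for survdata$Allegiances
toy_allegiances <- c("House X", "House Y", "House X", "None", "House X")

# table() counts how often each distinct value occurs
house_counts <- table(toy_allegiances)
print(house_counts)

# Sort so the largest group comes first
print(sort(house_counts, decreasing = TRUE))
```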
How many characters per house? (The same plot with smaller text and flipped axes, so the house names are readable.)
ggplot(survdata, aes(x=Allegiances)) +
  geom_bar() +
  theme_bw() +
  theme(text = element_text(size=8), axis.text.x = element_text(angle=90, vjust=0.5, hjust=1)) +
  theme(axis.title.x = element_blank()) +
  ylab("") + xlab("") +
  coord_flip()
length(which(is.na(survdata$Death.Year)))
## [1] 612
length(which(!is.na(survdata$Death.Year)))
## [1] 305
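The two counts above rely on is.na returning a logical vector. Because TRUE counts as 1 and FALSE as 0, sum gives the same answer as length(which(...)). A sketch on a made-up vector (death_year is a hypothetical stand-in for survdata$Death.Year):

```r
# Hypothetical death years; NA means no death was recorded
death_year <- c(299, NA, NA, 300, NA)

is_missing <- is.na(death_year)   # TRUE where the year is missing
print(is_missing)

n_alive <- sum(is_missing)    # number of NA entries
n_dead  <- sum(!is_missing)   # number of recorded death years
print(c(alive = n_alive, dead = n_dead))

# Same result as the length(which(...)) form
print(n_alive == length(which(is.na(death_year))))
```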
isDead<-rep("Dead", nrow(survdata))
isDead[ which( is.na(survdata$Death.Year)) ] <-"Alive"
survdata<-cbind(survdata, isDead=isDead)
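The three lines above build the isDead column by starting from "Dead" everywhere and overwriting the rows with a missing Death.Year. An alternative sketch does the same in one step with ifelse. It runs on a small made-up data frame named toy, not on the real data:

```r
# Hypothetical stand-in for survdata; values are made up
toy <- data.frame(
  Name       = c("Character A", "Character B", "Character C", "Character D"),
  Death.Year = c(299, NA, NA, 300),
  stringsAsFactors = FALSE
)

# ifelse() picks "Alive" where Death.Year is NA and "Dead" otherwise
toy$isDead <- ifelse(is.na(toy$Death.Year), "Alive", "Dead")
print(toy)
print(table(toy$isDead))
```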
ggplot(survdata, aes(x=Allegiances, fill=as.factor(isDead))) +
  geom_bar() +
  theme_bw() +
  theme(text = element_text(size=8), axis.text.x = element_text(angle=90, vjust=0.5, hjust=1)) +
  theme(axis.title.x = element_blank()) +
  ylab("") + xlab("") +
  coord_flip()
ggplot(survdata, aes(x=Death.Chapter)) + geom_histogram(colour="black", fill="white") + facet_grid(. ~ Book.of.Death) + theme_bw()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
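The messages above appear because no bin width was given, so ggplot2 picked one. Setting binwidth explicitly makes the choice visible and silences the message. A minimal sketch on made-up data, assuming ggplot2 is installed; the data frame toy and the width of 5 chapters are arbitrary choices for illustration:

```r
library(ggplot2)

# Hypothetical stand-in for survdata; the values are made up
toy <- data.frame(
  Death.Chapter = c(3, 12, 18, 25, 40, 7, 33, 51),
  Book.of.Death = c(1, 1, 1, 2, 2, 3, 3, 3)
)

# binwidth = 5 groups chapters five at a time; the value is an arbitrary choice
p <- ggplot(toy, aes(x = Death.Chapter)) +
  geom_histogram(binwidth = 5, colour = "black", fill = "white") +
  facet_grid(. ~ Book.of.Death) +
  theme_bw()
print(p)
```

A smaller binwidth shows more detail per book, while a larger one smooths the picture. It is worth trying a few values on the real data.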