Input RStudio
Practicum 1
nrow(myData)
ncol(myData)
dim(myData)
A vector is the elementary structure for data handling in R. It is a set of
simple elements, all being objects of the same class.
1:3
seq(from=1, to=3, by = 1)
c(1,2,3)
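The three expressions above build the same sequence 1, 2, 3. A minimal sketch to check this, and to see what "same class" means in practice (the object names a, b, c3 and mixed are only illustrative):

```r
# Three ways to build the sequence 1, 2, 3
a  <- 1:3
b  <- seq(from = 1, to = 3, by = 1)
c3 <- c(1, 2, 3)

# Compare the vectors element by element
all(a == b)
all(a == c3)

# All elements of a vector share one class:
# mixing numbers and text turns everything into text
mixed <- c(1, "two", 3)
class(mixed)   # "character"
```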
A factor is a data structure for categorical variables. The levels of the
factor can be extracted by
levels(myData$gender)
You can assign more descriptive names to the factor levels in the
following way.
myData$workshop <- factor(myData$workshop, levels = c(1,2,3) , labels
= c("R","SAS","SPSS") )
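As a small sketch of the relabelling step, assuming a made-up data frame in which workshop is coded 1, 2, 3 (the values below are hypothetical, not the practicum data):

```r
# Hypothetical data: workshop stored as numeric codes
myData <- data.frame(workshop = c(1, 2, 3, 2, 1))

# Turn the codes into a factor with descriptive labels
myData$workshop <- factor(myData$workshop,
                          levels = c(1, 2, 3),
                          labels = c("R", "SAS", "SPSS"))

levels(myData$workshop)   # the labels are now the factor levels
table(myData$workshop)    # counts per level
```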
4 main types (or classes) of variables
- Numeric: integer or floating point
- Character: text string (names)
• (for example: myData$ID <- as.character(myData$ID))
• Make a character vector yourself:
• c("john","paul","george")
• Convert with as.character()
- Factor:
• Categorical variable with limited number of levels
• Ordered or not
- Logical: TRUE/FALSE
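The four classes listed above can be inspected with class(). A short sketch with made-up values (the object name sizes is only illustrative):

```r
class(3.14)      # "numeric"
class(2L)        # "integer" (the L suffix marks an integer)
class("john")    # "character"
class(TRUE)      # "logical"
class(factor(c("male", "female")))   # "factor"

# An ordered factor: the levels have a meaningful order
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)
is.ordered(sizes)
sizes[1] < sizes[2]   # comparisons are allowed for ordered factors
```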
Exporting a modified table: write.table(myData, file =
"myFirstOutputFile")
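A minimal sketch of exporting and reading the file back in, assuming a small made-up data frame and writing to a temporary folder (tempdir() is used here only to avoid cluttering the working directory):

```r
# Hypothetical data to export
myData <- data.frame(ID = c("a1", "a2"), exam = c(12, 15))

# Write the table to a file in a temporary folder
outFile <- file.path(tempdir(), "myFirstOutputFile")
write.table(myData, file = outFile)

# Read it back to check what was written
readBack <- read.table(outFile, header = TRUE)
readBack
```

By default write.table() also writes the row names and puts quotes around text; the arguments row.names = FALSE and quote = FALSE switch this off.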
Make a matrix: demo.matrix <- matrix(1:12, nrow = 3, byrow = TRUE)
Example:
genetics <- matrix(c(230,198,72,249,207,44), nrow = 2, byrow = TRUE)
colnames(genetics) <- c("AA","AG","GG")
rownames(genetics) <- c("Case","Control")
to make:
         AA  AG GG
Case    230 198 72
Control 249 207 44
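A short sketch of what byrow does and how the row and column names can then be used for indexing (the object name by.column is only illustrative):

```r
# byrow = TRUE fills the matrix row by row
demo.matrix <- matrix(1:12, nrow = 3, byrow = TRUE)
demo.matrix[1, ]    # first row: 1 2 3 4

# The default (byrow = FALSE) fills column by column
by.column <- matrix(1:12, nrow = 3)
by.column[1, ]      # first row: 1 4 7 10

# The genetics example, with names used for indexing
genetics <- matrix(c(230, 198, 72, 249, 207, 44), nrow = 2, byrow = TRUE)
colnames(genetics) <- c("AA", "AG", "GG")
rownames(genetics) <- c("Case", "Control")

genetics["Case", "AG"]   # 198
rowSums(genetics)        # total per group: 500 and 500
```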
Sorting data:
order(myData$exam)
o <- order(myData$exam)
myData[o,]
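order() does not return the sorted values but the row numbers in sorted order, which are then used as an index. A sketch with a made-up data frame (the ID and exam values are hypothetical):

```r
# Hypothetical data
myData <- data.frame(ID = c("a", "b", "c", "d"),
                     exam = c(14, 9, 17, 11))

# Row numbers, from lowest to highest exam score
o <- order(myData$exam)
o                # 2 4 1 3

# Use these row numbers to reorder the whole data frame
myData[o, ]

# Highest score first
myData[order(myData$exam, decreasing = TRUE), ]
```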
Conditional selection: this is related to indexing, but instead of a
row/column number we give a condition that the row/column elements
must fulfill to be selected.
myData[myData$workshop == "SPSS" & myData$sex == "female", ]
select <- myData$workshop == "SPSS"
select
myData[select,]
-> leaving the position after the comma empty means that we are selecting
a subset of rows, while all the columns are included
The subset() function does almost the same, but observations
with missing values in the condition are omitted:
subset(myData, exam>10)
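The difference between the two approaches shows up when the condition involves a missing value. A sketch with a made-up data frame in which one exam score is NA (the object names byIndex and bySubset are only illustrative):

```r
# Hypothetical data with one missing exam score
myData <- data.frame(workshop = c("SPSS", "R", "SPSS", "SAS"),
                     sex = c("female", "male", "male", "female"),
                     exam = c(12, NA, 8, 15))

# The condition is a logical vector: TRUE NA FALSE TRUE
select <- myData$exam > 10
select

# Indexing keeps a row filled with NA for the missing value
byIndex <- myData[select, ]
nrow(byIndex)     # 3

# subset() drops rows where the condition is NA
bySubset <- subset(myData, exam > 10)
nrow(bySubset)    # 2
```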
Splitting, stacking and merging files
Stacking: rbind() function
myData.male <- myData[myData$sex == "male",]
myData.female <- myData[myData$sex == "female" ,]
rbind(myData.male,myData.female)
Split: split()
split(myData, myData$sex)
If you do not only want to see the split data, but also want to keep
it in your workspace, assign the result to an object:
myData.split <- split(myData, myData$sex)
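A sketch of splitting and stacking together, assuming a small made-up data frame (the object name restacked is only illustrative). split() returns a list with one data frame per level:

```r
# Hypothetical data
myData <- data.frame(sex = c("male", "female", "male", "female"),
                     exam = c(10, 14, 12, 16))

# Split into a list of data frames, one per level of sex
myData.split <- split(myData, myData$sex)
names(myData.split)      # "female" "male"
myData.split$female      # the rows for the female group

# Stack the parts on top of each other again
restacked <- rbind(myData.split$male, myData.split$female)
restacked
```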
Plotting in R
pch = plotting character (point symbol)
lty = line type
(0=blank, 1=solid (default), 2=dashed, 3=dotted, 4=dotdash,
5=longdash, 6=twodash)
To color the data points by group, for example a different color
for each species:
plot(Petal.Length~Sepal.Length,data=irisData,col=irisData$Species)
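A sketch that combines col, pch and lty in one plot. It assumes that irisData is a copy of the iris data set that ships with R; the legend and the regression line are additions for illustration only:

```r
# Assumption: irisData is the built-in iris data set
irisData <- iris

# Scatter plot, colored by species, with filled circles (pch = 19)
plot(Petal.Length ~ Sepal.Length, data = irisData,
     col = irisData$Species, pch = 19)

# A factor used as color is read as its level numbers (1, 2, 3)
legend("topleft", legend = levels(irisData$Species),
       col = 1:3, pch = 19)

# Add a dashed (lty = 2) regression line
abline(lm(Petal.Length ~ Sepal.Length, data = irisData), lty = 2)
```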
The tapply() function applies a function (such as mean) to each group
of a variable. Use it in combination with the barplot() function to get
a visual representation of the group means.
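A sketch of this combination, again assuming that irisData is a copy of the built-in iris data set (the object name meanPetal and the axis labels are only illustrative):

```r
# Assumption: irisData is the built-in iris data set
irisData <- iris

# Mean petal length per species
meanPetal <- tapply(irisData$Petal.Length, irisData$Species, mean)
meanPetal

# One bar per species, bar height = mean petal length
barplot(meanPetal, xlab = "Species", ylab = "Mean petal length")
```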