The objective of this chapter is to review some R syntax, functions and data structures that will be needed for the following chapters.
.RData file).

Summary

| | number of dimensions | number of data types |
|---|---|---|
| vector | 1 (length) | 1 |
| matrix | 2 | 1 |
| array | n | 1 |
| dataframe | 2 | n |
| list | 1 (length) | n |
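
The rows of this table can be checked interactively. The following is a minimal sketch with made-up values (the object names `v`, `m`, `a`, `df` and `l` are illustrative only) that builds one object of each kind and inspects its dimensions and types.

```r
## Made-up example objects, one per row of the table above
v <- c(1.5, 2, 3)                          ## vector: one type, only a length
m <- matrix(1:6, nrow = 2)                 ## matrix: one type, 2 dimensions
a <- array(1:24, dim = c(2, 3, 4))         ## array: one type, n dimensions
df <- data.frame(id = 1:3,                 ## dataframe: 2 dimensions,
                 name = c("a", "b", "c"))  ## one type per column
l <- list(1, "a", TRUE)                    ## list: any types, only a length

length(v)          ## 3
dim(m)             ## 2 3
dim(a)             ## 2 3 4
dim(df)            ## 3 2
sapply(df, class)  ## id: "integer", name: "character"
length(l)          ## 3

## Mixing types in a vector silently coerces to a single type
class(c(1, "a"))   ## "character"
```
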
- saveRDS() and readRDS() binary data.
- write.csv and read.csv (or write_csv and read_csv) and same for other types of spreadsheets.
- ggsave and file devices such as png(), pdf(), …).
- sessionInfo()
- if/else
- for loops and apply functions

► Question
Complete the following function. It is supposed to take two inputs,
x and y and, depending on whether x > y or x <= y, it
generates the permutation sample(x, y) in the first case or draws a
sample from rnorm(1000, x, y) in the second case. Finally, it
returns the sum of all values.

    fun <- function(x, y) {
        res <- NA
        if () { ## 1
            res <- sample(,) ## 2
        } else {
            res <- rnorm(, , ) ## 3
        }
        return() ## 4
    }

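
Before filling in the blanks, it may help to recall what the two sampling functions return. The following is a short sketch with arbitrary values (the seed, the numbers and the variable names `s`, `r` and `msg` are illustrative choices, and this is not the solution to the exercise).

```r
set.seed(1)              ## arbitrary seed, only for reproducibility

## sample(x, size) with a single number x >= 1 draws `size` values
## from 1:x without replacement
s <- sample(10, 3)
length(s)                ## 3
all(s %in% 1:10)         ## TRUE

## rnorm(n, mean, sd) draws n values from a normal distribution
r <- rnorm(1000, mean = 5, sd = 2)
length(r)                ## 1000

## if/else chooses one branch based on a single TRUE/FALSE condition
if (sum(s) > 15) {
    msg <- "large"
} else {
    msg <- "small"
}
```
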
► Question
Read the interro2.rds file from the rWSBIM1207 package (version 0.1.9
or later) into R. The path to the file can be found with the
rWSBIM1207::interro2.rds() function.
This dataframe provides the scores for 4 tests for 10 students.
This can be done using the apply function or using dplyr
functions. For the latter, see also rowwise().
Note the situation of students who have only taken 3 out of
4 tests (i.e. they have an NA for one test). It is up to you to decide
whether you simply take the mean of the 3, or whether you prefer to
drop the worst of 3 and calculate the mean of the 2 best marks. Make
sure you are aware of what your implementation returns and, ideally,
state it explicitly in your response.
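
The two ways of handling a missing test can be made explicit in code. The sketch below uses a small made-up dataframe (`scores`, not the interro2 data) and two hypothetical helper functions, `mean_available()` and `mean_best()`, to show how each choice could be written and applied row by row with base R.

```r
## Made-up scores for 3 students and 4 tests (not the interro2 data)
scores <- data.frame(test1 = c(10, 12, 8),
                     test2 = c(14, NA, 9),
                     test3 = c(12, 15, 7),
                     test4 = c(16, 11, 10))

## Option 1: mean of all tests that were actually taken
mean_available <- function(x) mean(x, na.rm = TRUE)

## Option 2: mean of the k best marks; NAs are dropped before ranking
mean_best <- function(x, k) {
    x <- sort(x, decreasing = TRUE)  ## sort() drops NAs by default
    mean(head(x, k))
}

## apply() with MARGIN = 1 calls the function once per row (student)
apply(scores, 1, mean_available)
apply(scores, 1, mean_best, k = 2)
```

Writing the NA handling into a named function makes the chosen behaviour visible to whoever reads the result.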
► Question
Create a matrix of dimensions 100 by 100 containing data from a normal distribution with mean 0 and standard deviation 1.
Compute the means of each row and each column using the apply()
and rowMeans()/colMeans() functions. Confirm that both provide
the same results.
Compute the difference between the column means and the row means. Does the result make sense?
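
The comparison between apply() and the dedicated functions can be tried on a much smaller matrix first. The following is a minimal sketch, assuming a 5 by 4 matrix (the size, the seed and the variable names are arbitrary choices).

```r
set.seed(42)                        ## arbitrary seed for reproducibility
m <- matrix(rnorm(20), nrow = 5)    ## 5 rows, 4 columns, mean 0 and sd 1

## apply(): MARGIN = 1 works along rows, MARGIN = 2 along columns
rm1 <- apply(m, 1, mean)
cm1 <- apply(m, 2, mean)

## Dedicated functions for the same computation
rm2 <- rowMeans(m)
cm2 <- colMeans(m)

## all.equal() compares numbers up to a small numerical tolerance
all.equal(rm1, rm2)                 ## TRUE
all.equal(cm1, cm2)                 ## TRUE
```
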
► Question
Using the kem2_se data from the rWSBIM1322 package, compute
the delta values for each gene (delta is the difference between the
highest and lowest values). To do this, write a function delta()
that takes a vector of numerics as input and returns the delta
value, and apply it on the object’s assay.
Re-use your delta() function and apply it on each sample.
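
One possible sketch of this pattern is shown below on a made-up matrix, where rows stand in for genes and columns for samples. The matrix `x` is hypothetical. With the real data, the matrix would instead be extracted from kem2_se, for example with assay() from the SummarizedExperiment package, assuming kem2_se is a SummarizedExperiment object.

```r
## delta: difference between the highest and lowest values of a vector
delta <- function(x) max(x) - min(x)

## Made-up expression-like matrix: 3 genes (rows) by 4 samples (columns)
x <- matrix(c(1, 5, 3, 2,
              10, 10, 12, 9,
              0, 4, 8, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4)))

apply(x, 1, delta)   ## one delta per gene:   4 3 8
apply(x, 2, delta)   ## one delta per sample: 10 6 9 7
```

If the data contain missing values, max() and min() return NA unless na.rm = TRUE is set, so this version of delta() would need adapting in that case.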
Page built: 2025-10-15 using R version 4.5.0 (2025-04-11)