The objective of this chapter is to review some R syntax, functions and data structures that will be needed in the following chapters.
.RData file).

Summary

| data structure | number of dimensions | number of data types |
|---|---|---|
| vector | 1 (length) | 1 |
| matrix | 2 | 1 |
| array | n | 1 |
| dataframe | 2 | n |
| list | 1 (length) | n |
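
As a quick illustration of the summary table, the sketch below builds one object of each kind and inspects its dimensions and types. The object names and values are made up for the example and are not part of the course data.

```r
## Hypothetical example objects, one per row of the summary table
v <- c(1.5, 2, 3)                          # vector: a single type, length only
m <- matrix(1:6, nrow = 2)                 # matrix: a single type, 2 dimensions
a <- array(1:24, dim = c(2, 3, 4))         # array: a single type, n dimensions
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"))  # dataframe: 2 dimensions, one type per column
l <- list(1, "a", TRUE)                    # list: each element can have its own type

## Dimensions (or length for one-dimensional structures)
length(v)
dim(m)
dim(a)
dim(df)
length(l)

## Types: columns of a dataframe and elements of a list may differ
sapply(df, class)
sapply(l, class)
```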
- saveRDS() and readRDS() for binary data.
- write.csv and read.csv (or write_csv and read_csv), and the same for other types of spreadsheets.
- ggsave and file devices such as png(), pdf(), …
- sessionInfo()
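
The saving and exporting functions listed above can be tried out with a small sketch like the one below. The data frame and the temporary file names are assumptions made for the example; only base R functions are used, so write_csv, read_csv and ggsave (from the readr and ggplot2 packages) are not shown.

```r
## Hypothetical data and temporary files, used only for this example
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
rds_file <- tempfile(fileext = ".rds")
csv_file <- tempfile(fileext = ".csv")
pdf_file <- tempfile(fileext = ".pdf")

## Binary serialisation: the R object is stored and restored as such
saveRDS(df, rds_file)
df_rds <- readRDS(rds_file)
all.equal(df, df_rds)

## Text export: column types are guessed again when the file is read back
write.csv(df, csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file)
all.equal(df, df_csv)

## File device: open the device, plot, then close it to write the file
pdf(pdf_file)
plot(df$x)
dev.off()

## Record the R version and loaded packages
sessionInfo()
```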
- if/else
- for loops and apply functions

► Question
Complete the following function. It is supposed to take two inputs, x and y, and, depending on whether x > y or x <= y, it generates the permutation sample(x, y) in the first case or draws a sample from rnorm(1000, x, y) in the second case. Finally, it returns the sum of all values.

    fun <- function(x, y) {
        res <- NA
        if () { ## 1
            res <- sample(,) ## 2
        } else {
            res <- rnorm(, , ) ## 3
        }
        return() ## 4
    }

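
For reference once you have tried the exercise, here is one possible completion of the template. It is a sketch that follows the description literally and is not the only valid answer. It assumes x and y are single numbers and that y is positive, since y is used both as a sample size and as a standard deviation.

```r
## One possible completion of the template (an illustrative answer)
fun <- function(x, y) {
    res <- NA
    if (x > y) {                   ## 1
        res <- sample(x, y)        ## 2: y values drawn without replacement from 1:x
    } else {
        res <- rnorm(1000, x, y)   ## 3: 1000 draws with mean x and standard deviation y
    }
    return(sum(res))               ## 4
}

set.seed(1)  # arbitrary seed, only for reproducibility
fun(10, 3)   # sum of 3 distinct integers between 1 and 10
fun(1, 2)    # sum of 1000 normal draws
```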
► Question
Read the interro2.rds file from the rWSBIM1207 package (version 0.1.9 or later) into R. The path to the file can be found with the rWSBIM1207::interro2.rds() function.
This dataframe provides the scores for 4 tests for 10 students.
This can be done using the apply function or using dplyr functions. For the latter, see also rowwise().
Note the situation of students who have only taken 3 out of 4 tests (i.e. they have an NA for one test). It is up to you to decide whether you simply take the mean of the 3, or whether you prefer to drop the worst of the 3 and calculate the mean of the 2 best marks. Make sure you are aware of what your implementation returns and, ideally, state it explicitly in your response.
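
The sketch below shows one way to handle the missing marks with apply. It assumes that the task is to average each student's best 3 marks, and it uses a small made-up data frame (scores, with hypothetical column names) rather than the real interro2.rds file, whose layout is not shown here. In this version, a student with only 3 marks gets the plain mean of those 3.

```r
## Hypothetical stand-in for the interro2 data: one row per student
scores <- data.frame(student = c("A", "B", "C"),
                     test1 = c(10, 8, 20),
                     test2 = c(12, NA, 5),
                     test3 = c(14, 10, 5),
                     test4 = c(16, 12, 5))

## Mean of the (up to) 3 best marks; sort() drops NA values by default
best3_mean <- function(x) {
    x <- sort(x, decreasing = TRUE)
    mean(head(x, 3))
}

## Apply the function to each row, leaving out the non-numeric first column
avg <- apply(scores[, -1], 1, best3_mean)
avg
```

Dropping the worst of 3 available marks instead would only require changing how many values head() keeps when fewer than 4 marks are present.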
► Question
Create a matrix of dimensions 100 by 100 containing data from a normal distribution of mean 0 and standard deviation 1.
Compute the means of each row and each column using the apply() and rowMeans()/colMeans() functions. Confirm that both provide the same results.
Compute the difference between the column means and the row means. Does the result make sense?
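
A minimal sketch of this exercise is given below. The seed and the object names are arbitrary choices made for the example. The comparison uses all.equal(), which tolerates tiny floating-point differences between the two ways of computing the means.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
m <- matrix(rnorm(100 * 100, mean = 0, sd = 1), nrow = 100, ncol = 100)

## Means with apply(): margin 1 is rows, margin 2 is columns
row_apply <- apply(m, 1, mean)
col_apply <- apply(m, 2, mean)

## Same means with the dedicated functions
row_fast <- rowMeans(m)
col_fast <- colMeans(m)

all.equal(row_apply, row_fast)
all.equal(col_apply, col_fast)

## Element-wise difference: column i is paired with row i
d <- col_fast - row_fast
summary(d)
```

Since every row and every column is a set of 100 draws from the same distribution, the differences are expected to scatter around 0.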
► Question
Using the kem2_se data from the rWSBIM1322 package, compute the delta values for each gene (delta is the difference between the highest and lowest values). To do this, write a function delta() that takes a vector of numerics as input and returns the delta value, and apply it on the object’s assay.
Re-use your delta() function and apply it on each sample.
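
A possible delta() function and its use with apply are sketched below. The matrix m is a made-up stand-in for the assay, with genes in rows and samples in columns. With the real data, and assuming kem2_se is a SummarizedExperiment object, the matrix would instead be extracted with assay(kem2_se).

```r
## Difference between the highest and lowest values of a numeric vector
delta <- function(x) max(x) - min(x)

## Hypothetical stand-in for the assay matrix (genes in rows, samples in columns)
m <- matrix(c(1, 5, 3,
              2, 2, 8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

gene_delta <- apply(m, 1, delta)    # one value per gene (rows)
sample_delta <- apply(m, 2, delta)  # one value per sample (columns)
gene_delta
sample_delta
```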
Page built: 2024-10-28 using R version 4.4.1 (2024-06-14)