Chapter 1 R refresher

The objective of this chapter is to review the R syntax, functions and data structures that will be needed in the following chapters.

1.1 Administration

BiocManager::install("UCLouvain-CBIO/rWSBIM1322")

1.2 Basic data structures and operations

Summary

            number of dimensions   number of data types
vector      1 (length)             1
matrix      2                      1
array       n                      1
dataframe   2                      n
list        1 (length)             n
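
The rows of the summary can be checked directly in R. The following is a minimal sketch using made-up toy objects (the names v, m, a, df and l are only illustrative) that inspects the dimensions with length() and dim(), and the data types with class().

```r
## Hypothetical toy objects, one per row of the summary table
v <- c(1.5, 2, 3.5)                        # vector: one type, has a length
m <- matrix(1:6, nrow = 2)                 # matrix: one type, 2 dimensions
a <- array(1:24, dim = c(2, 3, 4))         # array: one type, n dimensions
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 stringsAsFactors = FALSE) # dataframe: 2 dimensions, one type per column
l <- list(1L, "a", TRUE, m)                # list: has a length, elements of any type

length(v)  # 3
dim(v)     # NULL: a vector has a length but no dim attribute
dim(m)     # 2 3
dim(a)     # 2 3 4
dim(df)    # 3 2
length(l)  # 4

## Mixing types in a vector coerces all elements to a single type
class(c(1, "a", TRUE))  # "character"

## A dataframe can hold a different type in each column
sapply(df, class)

## A list can hold a different type in each element
sapply(l, class)
```

Note that a dataframe is itself a list of columns of equal length, which is why it can store several data types while keeping two dimensions.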

1.3 Tidyverse

  • The dplyr package
  • Piping
  • Wide and long data, and their conversion with the pivot_longer and pivot_wider functions.
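
As a reminder of how these pieces fit together, here is a minimal sketch, assuming the dplyr and tidyr packages are installed. The table and its column names (gene, s1, s2, expression) are invented for illustration only.

```r
library("dplyr")
library("tidyr")

## Hypothetical wide table: one row per gene, one column per sample
wide <- data.frame(gene = c("g1", "g2", "g3"),
                   s1 = c(10, 20, 30),
                   s2 = c(12, 18, 33))

## Wide to long: one row per gene/sample pair
long <- wide %>%
    pivot_longer(cols = c(s1, s2),
                 names_to = "sample",
                 values_to = "expression")

## Piping dplyr verbs: keep some rows, then summarise per gene
means <- long %>%
    filter(expression > 10) %>%
    group_by(gene) %>%
    summarise(mean_expression = mean(expression))

## Long back to wide
wide2 <- long %>%
    pivot_wider(names_from = sample,
                values_from = expression)
```

The pipe passes the result of the left-hand side as the first argument of the function on the right-hand side, so each step reads in the order in which it is executed.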

1.4 Saving and exporting

  • Saving and reading binary data with saveRDS() and readRDS().
  • Exporting and importing data with write.csv and read.csv (or write_csv and read_csv), and similarly for other types of spreadsheets.
  • Saving figures (ggsave and file devices such as png(), pdf(), …).
  • Package versions: sessionInfo()
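
The base R functions listed above can be tried out with the following sketch. The data and file names are hypothetical, and everything is written to a temporary directory so that no existing file is overwritten.

```r
## Hypothetical toy data; all files go to a temporary directory
df <- data.frame(id = 1:3, score = c(12.5, 15, 9))
tmp <- tempdir()

## Binary serialisation of a single R object
rds_file <- file.path(tmp, "df.rds")
saveRDS(df, rds_file)
df_rds <- readRDS(rds_file)

## Text export; row.names = FALSE avoids an extra column of row names
csv_file <- file.path(tmp, "df.csv")
write.csv(df, csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file)

## Saving a base graphics figure: open a file device, plot, close the device
pdf_file <- file.path(tmp, "scores.pdf")
pdf(pdf_file)
plot(df$id, df$score, xlab = "id", ylab = "score")
dev.off()

## Record the R version and the versions of the loaded packages
sessionInfo()
```

The rds format is convenient to pass objects between R sessions, while csv files can be opened by other software. For ggplot2 figures, ggsave plays the role of the file device.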

1.5 Programming

1.6 Additional exercises

► Question

Complete the following function. It takes two inputs, x and y. If x > y, it generates the permutation sample(x, y); otherwise (x <= y), it draws a sample from rnorm(1000, x, y). Finally, it returns the sum of all values.

fun <- function(x, y) {
    res <- NA
    if () { ## 1
        res <- sample(,) ## 2
    } else {
        res <- rnorm(, , ) ## 3
    }
    return() ## 4
}

► Question

Read the interro2.rds file from the rWSBIM1207 package (version 0.1.9 or later) into R. The path to the file can be found with the rWSBIM1207::interro2.rds() function.

This dataframe provides the scores for 4 tests for 10 students.

  • Write a function that calculates the average score for the 3 best tests.
  • Calculate this average score for the 10 students.

This can be done using the apply function or using dplyr functions. For the latter, see also rowwise().
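
The mechanics of both approaches can be sketched on a different, simpler problem. The example below assumes the dplyr package is installed and uses made-up scores (not the interro2 data) and a hypothetical helper, score_range, that computes the difference between a student's best and worst scores.

```r
library("dplyr")

## Hypothetical scores, not the interro2 data
scores <- data.frame(student = c("A", "B", "C"),
                     test1 = c(10, 14, 8),
                     test2 = c(12, 11, 15),
                     test3 = c(9, 16, 13))

## A function of one student's scores: best minus worst
score_range <- function(x) max(x) - min(x)

## Base R: apply the function over the rows (MARGIN = 1) of the score columns
range_apply <- apply(scores[, -1], 1, score_range)

## dplyr: rowwise() makes the following mutate() operate one row at a time,
## and c_across() collects the selected columns of that row into a vector
range_dplyr <- scores %>%
    rowwise() %>%
    mutate(score_range = score_range(c_across(test1:test3))) %>%
    ungroup()
```

Both versions call the same function once per student; only the way the rows are iterated over differs.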

Note the case of students who have taken only 3 out of 4 tests (i.e. they have an NA for one test). It is up to you to decide whether you simply take the mean of the 3, or whether you prefer to drop the worst of the 3 and calculate the mean of the 2 best marks. Make sure you are aware of what your implementation returns and, ideally, state it explicitly in your response.

Page built: 2023-11-27 using R version 4.3.1 Patched (2023-07-10 r84676)