Two tibbles are provided that give access to DepMap data, as shared by the Broad Institute's DepMap project on Figshare (https://figshare.com/authors/Broad_DepMap/5514062).
- The [dmsets()] function returns a tibble with DepMap datasets. Each dataset is described by its title, its unique identifier, the number of files it contains, the Figshare URL, and a `DepMapDataset` object that contains further details of the dataset.
- The [dmfiles()] function returns a tibble with DepMap files. Each file is described by its dataset identifier, its own unique identifier, its name, size (in bytes), a download URL, md5 hash and mime type.
- DepMap data files can be downloaded with the [dmget()] function, which takes as input a tibble or data.frame of DepMap files, such as the one returned by [dmfiles()]. Files are downloaded and automatically stored in the package's central cache. See [dmCache()].
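The selection step that usually precedes a download can be sketched as follows. This is only an illustration: the `pick_files()` helper and the mock table are hypothetical, and the mock merely mirrors some of the column names of the files tibble. The call to `dmget()` is left as a comment because it needs the `depmap` package and network access.

```r
## Hypothetical helper: keep the rows of a files table whose name
## matches a regular expression.
pick_files <- function(tab, pattern) {
    tab[grepl(pattern, tab$name), ]
}

## Mock table with some of the columns of dmfiles(), for illustration only
mock <- data.frame(
    dataset_id = c(1L, 1L, 2L),
    id = c(11L, 12L, 21L),
    name = c("README.txt", "AvanaGuideMap.csv", "README.txt"),
    download_url = c("https://example.org/11",
                     "https://example.org/12",
                     "https://example.org/21"))
readmes <- pick_files(mock, "^README")

## With the depmap package installed, the same selection could be
## passed to dmget(), which stores the files in the central cache:
## library(depmap)
## dmget(pick_files(dmfiles(), "^README"))
```

Subsetting before calling `dmget()` matters in practice because some files are several gigabytes in size.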
Usage
DepMapDataset(id)
DepMapFiles(x)
dmFileNames(object)
dmTitle(object)
dmNumFiles(object)
dmget(dmtab, cache = dmCache())
dmfiles()
dmsets()
Arguments
- id
- `numeric()` with one or multiple DepMap dataset identifier(s). Note that `id` is converted to an integer. Missing values are not permitted. 
- x
- either a `numeric()` that will be passed to `DepMapDataset`, or an instance (or a list of instances) of `DepMapDataset`. 
- object
- an instance of class `DepMapDataset`. 
- dmtab
- A `tibble` or `data.frame` containing the files to be downloaded, such as the one returned by [dmfiles()] or created by [DepMapFiles()]. It is expected to contain the `"name"`, `"id"` and `"download_url"` variables. 
- cache
- Object of class [BiocFileCache()]. Default is to use the central `depmap` cache returned by [dmCache()], but users can use their own cache. 
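The requirements on `dmtab` can be checked before attempting a download. The sketch below uses a hypothetical `check_dmtab()` helper that is not part of the package; it only tests for the three variables named above. The custom cache at the end is shown as a comment and assumes that `BiocFileCache` is installed.

```r
## Hypothetical helper: stop early if a files table lacks the
## variables that dmget() is documented to expect.
check_dmtab <- function(dmtab) {
    required <- c("name", "id", "download_url")
    missing_vars <- setdiff(required, names(dmtab))
    if (length(missing_vars) > 0)
        stop("Missing variable(s): ",
             paste(missing_vars, collapse = ", "))
    invisible(TRUE)
}

## A table with all required variables passes silently
ok <- data.frame(name = "README.txt", id = 1L,
                 download_url = "https://example.org/1")
check_dmtab(ok)

## A table without download_url is rejected
bad <- ok[, c("name", "id")]
res <- tryCatch(check_dmtab(bad),
                error = function(e) conditionMessage(e))
res
#> [1] "Missing variable(s): download_url"

## A private cache instead of the central one could be set up as:
## my_cache <- BiocFileCache::BiocFileCache(tempfile(), ask = FALSE)
## dmget(ok, cache = my_cache)
```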
Details
The `DepMapDataset` class stores the information describing a DepMap dataset, as stored on Figshare (where datasets are called articles). The [DepMapDataset()] constructor requires one or multiple dataset identifiers and returns one instance or a list of instances.
The following accessors are available:
- [dmFileNames()] returns the dataset's filenames.
- [dmTitle()] returns the dataset's title.
- [dmNumFiles()] returns the number of files in the dataset.
(These are used to construct the main depmap dataset tibble.)
A tibble describing the files in a DepMap dataset can be created with the [DepMapFiles()] function. It takes either one or multiple dataset identifiers, or one or a list of `DepMapDataset` instances.
The [DepMapDataset()] and [DepMapFiles()] functions are mostly used internally, to create the `dmsets` and `dmfiles` tibbles. If a more recent dataset is available on Figshare and not (yet) in the `depmap` package, a user might create the depmap files table to download the files, and/or open a [GitHub issue](https://github.com/UCLouvain-CBIO/depmap) for the new data to be added by the maintainer(s).
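For a dataset that is on Figshare but not yet in the package tables, the steps might look like the sketch below. It reuses the dataset identifier 24667905 and the `README.txt` file name from the examples on this page, and requires network access. What `dmget()` returns is not described here, so the result is only assigned, not inspected.

```r
library(depmap)

## Files of one dataset, queried from Figshare
new_files <- DepMapFiles(24667905)

## Keep only the small README rather than all files of the dataset
readme <- new_files[new_files$name == "README.txt", ]

## Download the file into the central cache
res <- dmget(readme)
```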
All the information is retrieved from Figshare using their API, as described at https://docs.figshare.com.
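To give an idea of the kind of record involved, the sketch below converts a parsed article record into a files table. It is not the package's implementation: `files_table()` is a hypothetical helper, the record is hand-written, and the field names (`id`, `name`, `size`, `download_url`) are assumed to follow the file objects of the Figshare API v2.

```r
## Hypothetical helper: turn the `files` element of a parsed article
## record into a data.frame with one row per file.
files_table <- function(article) {
    do.call(rbind, lapply(article$files, function(f) {
        data.frame(dataset_id = article$id,
                   id = f$id,
                   name = f$name,
                   size = f$size,
                   download_url = f$download_url,
                   stringsAsFactors = FALSE)
    }))
}

## Hand-written record mimicking the shape of an API response
article <- list(
    id = 1L,
    title = "Example dataset",
    files = list(
        list(id = 11L, name = "README.txt", size = 100,
             download_url = "https://example.org/11"),
        list(id = 12L, name = "data.csv", size = 2000,
             download_url = "https://example.org/12")))
tab <- files_table(article)

## With network access and jsonlite installed, a real record could be
## fetched along these lines (assumed endpoint of the Figshare API v2):
## article <- jsonlite::fromJSON(
##     "https://api.figshare.com/v2/articles/24667905",
##     simplifyVector = FALSE)
```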
Adding new datasets
Adding new datasets is simple. Once a new dataset (or Article, as it is called on Figshare) has been identified on the Broad Institute's [DepMap project on Figshare](https://figshare.com/authors/Broad_DepMap/5514062), one needs to add the dataset's URL to the `depmapURLs` vector in [`inst/scripts/make-dmfiles.R`](https://github.com/UCLouvain-CBIO/depmap/blob/master/inst/scripts/make-dmfiles.R), and re-run the script to update the `dmsets.rds` and `dmfiles.rds` files in `inst/extdata`.
Feel free to send a GitHub pull request or open a [GitHub issue](https://github.com/UCLouvain-CBIO/depmap) for the new data to be added by the maintainer(s).
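The contents of the script are not reproduced on this page. Purely as a sketch of what regenerating the files table from a vector of URLs could involve, the code below extracts dataset identifiers from URLs. The `figshare_id()` helper is hypothetical, the URLs are illustrative, and the approach assumes that each URL ends in the dataset identifier.

```r
## Hypothetical helper: extract the trailing numeric identifier from a
## Figshare dataset URL.
figshare_id <- function(url) {
    as.integer(sub("^.*/([0-9]+)/?$", "\\1", url))
}

## Illustrative URLs, assumed to end in the dataset identifier
urls <- c(
    "https://figshare.com/articles/dataset/DepMap_23Q4_Public/24667905",
    "https://figshare.com/articles/dataset/DepMap_23Q2_Public/22765112")
ids <- figshare_id(urls)
ids
#> [1] 24667905 22765112

## With the depmap package and network access, the files table could
## then be rebuilt and saved, for example:
## saveRDS(depmap::DepMapFiles(ids), "inst/extdata/dmfiles.rds")
```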
Examples
## The depmap datasets
dmsets
#> function () 
#> readRDS(dir(system.file("extdata", package = "depmap"), pattern = "dmsets.rds", 
#>     full.names = TRUE))
#> <bytecode: 0x55659413c4f0>
#> <environment: namespace:depmap>
## The depmap files
dmfiles
#> function () 
#> readRDS(dir(system.file("extdata", package = "depmap"), pattern = "dmfiles.rds", 
#>     full.names = TRUE))
#> <bytecode: 0x55659448fb40>
#> <environment: namespace:depmap>
############################################################
## Mostly for internal use, or to update/generate the depmap
## dataset and files tables.
## One dataset identifier: 24667905
my_dmset <- DepMapDataset(24667905)
my_dmset
#> Title: DepMap 23Q4 Public 
#> Id: 24667905 
#> License: CC BY 4.0 
#> Use `DepMapFiles()` to list 56 files
## Multiple dataset identifiers
my_dmsets <- DepMapDataset(c(24667905, 22765112))
my_dmsets
#> [[1]]
#> Title: DepMap 23Q4 Public 
#> Id: 24667905 
#> License: CC BY 4.0 
#> Use `DepMapFiles()` to list 56 files
#> 
#> [[2]]
#> Title: DepMap 23Q2 Public 
#> Id: 22765112 
#> License: CC BY 4.0 
#> Use `DepMapFiles()` to list 52 files
#> 
## Create the files table from one or multiple dataset
## identifiers
DepMapFiles(24667905)
#> # A tibble: 56 × 7
#>    dataset_id       id name                     size download_url md5   mimetype
#>         <int>    <int> <chr>                   <dbl> <chr>        <chr> <chr>   
#>  1   24667905 43347678 README.txt             2.91e4 https://ndo… 4d2d… text/pl…
#>  2   24667905 43346361 AchillesCommonEssenti… 1.70e4 https://ndo… 1cbf… text/pl…
#>  3   24667905 43346367 AchillesHighVarianceG… 7.07e3 https://ndo… 3ac0… text/pl…
#>  4   24667905 43346370 AchillesNonessentialC… 1.15e4 https://ndo… 9b21… text/pl…
#>  5   24667905 43346379 AchillesScreenQCRepor… 3.16e5 https://ndo… c5dd… text/pl…
#>  6   24667905 43346382 AchillesSequenceQCRep… 4.37e5 https://ndo… 6ade… text/pl…
#>  7   24667905 43346391 AvanaGuideMap.csv      1.60e7 https://ndo… b694… text/pl…
#>  8   24667905 43346409 AvanaLogfoldChange.csv 3.17e9 https://ndo… 58b1… text/pl…
#>  9   24667905 43346505 AvanaRawReadcounts.csv 9.70e8 https://ndo… a7b5… text/pl…
#> 10   24667905 43346574 CRISPRGeneDependency.… 3.94e8 https://ndo… b581… text/pl…
#> # ℹ 46 more rows
DepMapFiles(my_dmset)
#> # A tibble: 56 × 7
#>    dataset_id       id name                     size download_url md5   mimetype
#>         <int>    <int> <chr>                   <dbl> <chr>        <chr> <chr>   
#>  1   24667905 43347678 README.txt             2.91e4 https://ndo… 4d2d… text/pl…
#>  2   24667905 43346361 AchillesCommonEssenti… 1.70e4 https://ndo… 1cbf… text/pl…
#>  3   24667905 43346367 AchillesHighVarianceG… 7.07e3 https://ndo… 3ac0… text/pl…
#>  4   24667905 43346370 AchillesNonessentialC… 1.15e4 https://ndo… 9b21… text/pl…
#>  5   24667905 43346379 AchillesScreenQCRepor… 3.16e5 https://ndo… c5dd… text/pl…
#>  6   24667905 43346382 AchillesSequenceQCRep… 4.37e5 https://ndo… 6ade… text/pl…
#>  7   24667905 43346391 AvanaGuideMap.csv      1.60e7 https://ndo… b694… text/pl…
#>  8   24667905 43346409 AvanaLogfoldChange.csv 3.17e9 https://ndo… 58b1… text/pl…
#>  9   24667905 43346505 AvanaRawReadcounts.csv 9.70e8 https://ndo… a7b5… text/pl…
#> 10   24667905 43346574 CRISPRGeneDependency.… 3.94e8 https://ndo… b581… text/pl…
#> # ℹ 46 more rows
DepMapFiles(c(24667905, 22765112))
#> # A tibble: 108 × 7
#>    dataset_id       id name                     size download_url md5   mimetype
#>         <int>    <int> <chr>                   <dbl> <chr>        <chr> <chr>   
#>  1   24667905 43347678 README.txt             2.91e4 https://ndo… 4d2d… text/pl…
#>  2   24667905 43346361 AchillesCommonEssenti… 1.70e4 https://ndo… 1cbf… text/pl…
#>  3   24667905 43346367 AchillesHighVarianceG… 7.07e3 https://ndo… 3ac0… text/pl…
#>  4   24667905 43346370 AchillesNonessentialC… 1.15e4 https://ndo… 9b21… text/pl…
#>  5   24667905 43346379 AchillesScreenQCRepor… 3.16e5 https://ndo… c5dd… text/pl…
#>  6   24667905 43346382 AchillesSequenceQCRep… 4.37e5 https://ndo… 6ade… text/pl…
#>  7   24667905 43346391 AvanaGuideMap.csv      1.60e7 https://ndo… b694… text/pl…
#>  8   24667905 43346409 AvanaLogfoldChange.csv 3.17e9 https://ndo… 58b1… text/pl…
#>  9   24667905 43346505 AvanaRawReadcounts.csv 9.70e8 https://ndo… a7b5… text/pl…
#> 10   24667905 43346574 CRISPRGeneDependency.… 3.94e8 https://ndo… b581… text/pl…
#> # ℹ 98 more rows
DepMapFiles(my_dmsets)
#> # A tibble: 108 × 7
#>    dataset_id       id name                     size download_url md5   mimetype
#>         <int>    <int> <chr>                   <dbl> <chr>        <chr> <chr>   
#>  1   24667905 43347678 README.txt             2.91e4 https://ndo… 4d2d… text/pl…
#>  2   24667905 43346361 AchillesCommonEssenti… 1.70e4 https://ndo… 1cbf… text/pl…
#>  3   24667905 43346367 AchillesHighVarianceG… 7.07e3 https://ndo… 3ac0… text/pl…
#>  4   24667905 43346370 AchillesNonessentialC… 1.15e4 https://ndo… 9b21… text/pl…
#>  5   24667905 43346379 AchillesScreenQCRepor… 3.16e5 https://ndo… c5dd… text/pl…
#>  6   24667905 43346382 AchillesSequenceQCRep… 4.37e5 https://ndo… 6ade… text/pl…
#>  7   24667905 43346391 AvanaGuideMap.csv      1.60e7 https://ndo… b694… text/pl…
#>  8   24667905 43346409 AvanaLogfoldChange.csv 3.17e9 https://ndo… 58b1… text/pl…
#>  9   24667905 43346505 AvanaRawReadcounts.csv 9.70e8 https://ndo… a7b5… text/pl…
#> 10   24667905 43346574 CRISPRGeneDependency.… 3.94e8 https://ndo… b581… text/pl…
#> # ℹ 98 more rows