leduc2022_pSCoPE.Rd
Single-cell proteomics data acquired by the Slavov Lab. This is the dataset associated with the third version of the preprint. It contains quantitative information on melanoma cells and monocytes at the PSM, peptide and protein levels. This version of the data was acquired using the pSCoPE MS acquisition approach.
leduc2022_pSCoPE
A QFeatures object with 138 assays, each assay being a SingleCellExperiment object:
Assay 1-134: PSM data acquired with a TMT-18plex protocol, hence those assays contain 18 columns. Columns hold quantitative information from single-cell channels, carrier channels, reference channels, empty (negative control) channels and unused channels.
peptides: peptide data containing quantitative data for 20,804
peptides and 1556 single cells. These data were filtered
to keep high-quality PSMs, all batches were normalized to
the reference channel, PSMs were aggregated to peptides, and
single cells with low median coefficient of variation were kept.
peptides_log: peptide data containing quantitative data for
12,284 peptides and 1543 single cells. The peptides data were
further normalized, highly missing peptides were removed and the
quantifications were log-transformed.
proteins_norm2: protein data containing quantitative data for
2844 proteins and 1543 single cells. The peptides from
peptides_log were aggregated to proteins and normalized.
proteins_processed: protein data containing quantitative data
for 2844 proteins and 1543 single cells. The proteins_norm2
data were imputed, batch corrected and normalized.
The colData(leduc2022_pSCoPE()) contains the cell type annotation,
the LC batch information, the TMT label and the MS run ID. We also added
the sample preparation annotations provided by the cellenONE dispensing
device (only for single cells): the time stamp of cell isolation by the
device, the diameter and elongation of the cell, the ID of the
sample glass slide (4 slides in total), the field within the glass
slide (each slide is divided into 4 fields), the pooled well ID (each
field contains 9 pools), and the x and y coordinates of each cell
dropped in a field and of each cell pool upon pickup. Finally, we also
retrieved the melanoma subpopulation assignment generated by the authors
upon data analysis. The main population is encoded as A while the
small population is encoded as B. The description of the rowData
fields for the PSM data can be found in the MaxQuant documentation.
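As a minimal sketch of how the annotations and assays described above can be inspected, the snippet below loads the dataset and looks at the sample metadata and one processed assay. It assumes the scpdata and QFeatures packages are installed and that the data can be retrieved from ExperimentHub or a local cache. No colData field names are assumed; the code lists them instead.

```r
library(scpdata)    # provides leduc2022_pSCoPE()
library(QFeatures)  # provides the QFeatures class and its accessors

# Retrieve the dataset (downloaded on first use, then read from cache)
scp <- leduc2022_pSCoPE()

# Sample annotations shared by all assays; list the available fields
# before relying on any particular column name
annot <- colData(scp)
colnames(annot)

# Extract one of the processed assays by name
prots <- scp[["proteins_processed"]]
dim(prots)

# Feature annotations (MaxQuant fields) of the first PSM assay
head(rowData(scp[[1]]))
```

The assay names used here (proteins_processed and the PSM assays) are the ones listed in the format description; `names(scp)` returns the full list.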
The data were downloaded from the Slavov Lab website.
The raw data and the quantification data can also be found in the
MassIVE repository MSV000089159:
ftp://massive.ucsd.edu/MSV000089159.
The data were acquired using the following setup. More information
can be found in the source article (see References).
Cell isolation: CellenONE cell sorting.
Sample preparation: improved SCoPE2 protocol performed with the CellenONE liquid handling system. nPOP cell lysis (DMSO) + trypsin digestion + TMT-18plex labeling and pooling. A target library was also generated to perform prioritized DDA (Huffman et al. 2022) using MaxQuant.Live (2.0.3).
Separation: online nLC (Dionex UltiMate 3000 UHPLC with a 25 cm x 75 µm IonOpticks Aurora Series UHPLC column; 200 nL/min).
Ionization: ESI (1,800V).
Mass spectrometry: Thermo Scientific Q-Exactive (MS1 resolution = 70,000; MS2 accumulation time = 300 ms; MS2 resolution = 70,000). Prioritized data acquisition was performed using the pSCoPE protocol (Huffman et al. 2022).
Data analysis: MaxQuant (1.6.17.0) + DART-ID.
The PSM data were collected from a shared Google Drive folder that
is accessible from the SlavovLab website (see Source section).
The folder contains the following files of interest:
ev_updated.txt: the MaxQuant/DART-ID output file
annotation.csv: sample annotation
batch.csv: batch annotation
t0.csv: the processed data table containing the peptides data
t3.csv: the processed data table containing the peptides_log data
t4b.csv: the processed data table containing the proteins_norm2 data
t6.csv: the processed data table containing the proteins_processed data
We combined the sample annotation and the batch annotation into
a single table. We also formatted the quantification table so that
its columns match those of the annotations. The annotation and
quantification tables were then combined into a single QFeatures
object using the scp::readSCP() function.
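The annotation merge described above can be sketched in base R as follows. This is an illustration on toy data, not the script used to build the dataset: the column names (Set, Channel, SampleType, lcbatch) and their values are hypothetical stand-ins for the contents of annotation.csv and batch.csv.

```r
# Toy stand-in for annotation.csv: one row per channel of each MS run
# (all column names and values are hypothetical)
annotation <- data.frame(
  Set = c("runA", "runA", "runB"),
  Channel = c("RI1", "RI2", "RI1"),
  SampleType = c("Carrier", "Melanoma", "Monocyte")
)

# Toy stand-in for batch.csv: one row per MS run
batch <- data.frame(
  Set = c("runA", "runB"),
  lcbatch = c("LC1", "LC2")
)

# Left join: every annotated channel keeps its row and gains
# the batch-level fields of its run
meta <- merge(annotation, batch, by = "Set", all.x = TRUE)

# Sanity checks: no channel was lost and every channel found its batch
stopifnot(nrow(meta) == nrow(annotation))
stopifnot(!anyNA(meta$lcbatch))
```

A table of this shape, together with the formatted quantification table, is what gets passed to scp::readSCP(); the argument names of that function differ between scp versions, so check ?readSCP for the installed release.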
The 4 CSV files were loaded and formatted as SingleCellExperiment
objects, and the sample metadata were matched to the column names
(the mapping was retrieved after running the authors' original R script)
and stored in the colData.
Each object was then added to the QFeatures object (containing the
PSM assays), and the rows of the peptide data were linked to the
rows of the PSM data based on the peptide sequence information
through an AssayLink object.
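The linking step can be illustrated on a toy example with a single PSM assay. This sketch assumes the QFeatures and SingleCellExperiment packages are installed. The assay contents and the linking variable name Sequence are hypothetical, and the real dataset links 134 PSM assays rather than one.

```r
library(QFeatures)
library(SingleCellExperiment)

# Toy PSM assay: 3 PSMs that map to 2 peptide sequences
psm <- SingleCellExperiment(
  assays = list(intensity = matrix(
    1:6, nrow = 3,
    dimnames = list(paste0("psm", 1:3), c("cell1", "cell2"))
  )),
  rowData = DataFrame(Sequence = c("PEPA", "PEPA", "PEPB"))
)

# Toy peptide assay, standing in for a table loaded from one of the CSV files
pep <- SingleCellExperiment(
  assays = list(intensity = matrix(
    c(5, 3, 9, 6), nrow = 2,
    dimnames = list(c("PEPA", "PEPB"), c("cell1", "cell2"))
  )),
  rowData = DataFrame(Sequence = c("PEPA", "PEPB"))
)

# Start from the PSM-level object, then append the peptide assay
qf <- QFeatures(list(psms = psm))
qf <- addAssay(qf, pep, name = "peptides")

# Link peptide rows to PSM rows through the shared sequence variable
qf <- addAssayLink(qf, from = "psms", to = "peptides",
                   varFrom = "Sequence", varTo = "Sequence")

# Inspect the resulting AssayLink
assayLink(qf, "peptides")
```

Once the link exists, subsetting the QFeatures object by a feature keeps the related rows of the linked assays together.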
Andrew Leduc, Gray Huffman, and Nikolai Slavov. 2022. “Droplet Sample Preparation for Single-Cell Proteomics Applied to the Cell Cycle.” bioRxiv.
Gray Huffman, Andrew Leduc, Christoph Wichmann, Marco di Gioia, Francesco Borriello, Harrison Specht, Jason Derks, et al. 2022. “Prioritized Single-Cell Proteomics Reveals Molecular and Functional Polarization across Primary Macrophages.” bioRxiv.
# \donttest{
leduc2022_pSCoPE()
#> see ?scpdata and browseVignettes('scpdata') for documentation
#> loading from cache
#> An instance of class QFeatures containing 138 assays:
#> [1] eAL00219: SingleCellExperiment with 6269 rows and 18 columns
#> [2] eAL00220: SingleCellExperiment with 6603 rows and 18 columns
#> [3] eAL00221: SingleCellExperiment with 6511 rows and 18 columns
#> ...
#> [136] peptides_log: SingleCellExperiment with 12284 rows and 1543 columns
#> [137] proteins_norm2: SingleCellExperiment with 2844 rows and 1543 columns
#> [138] proteins_processed: SingleCellExperiment with 2844 rows and 1543 columns
# }