Single-cell proteomics data acquired by the Slavov Lab. This is the dataset associated with the fourth version of the preprint (and the Genome Biology publication). It contains quantitative information on melanoma cells at the precursor, peptide and protein levels. This version of the data was acquired using the plexDIA MS acquisition protocol.

leduc2022_plexDIA

Format

A QFeatures object with 48 assays, each assay being a SingleCellExperiment object:

  • Assays 1-45: precursor data acquired with an mTRAQ-3 protocol, hence each of these assays contains 3 columns. Columns hold quantitative information from single cells or negative control samples.

  • Ms1Extracted: the DIA-NN MS1 extracted signal. It combines the information from assays 1-45 into a single assay (45 runs x 3 channels = 135 columns).

  • peptides: peptide data containing quantitative data for 3,608 peptides and 104 single cells. The data were filtered to 1% protein FDR.

  • proteins: protein data containing quantitative data for 508 proteins and 105 single cells. Note that the peptide and protein data provided by the authors differ by 3 samples. The precursor data were aggregated to protein intensities using maxLFQ. The protein data were then median normalized by column and by row, log2 transformed, imputed using KNN (k = 3), median normalized again by column and by row, batch corrected using ComBat, and median normalized by column and by row once more.
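
As an illustration of the first two processing steps listed for the protein data (median normalization by column and by row, then log2 transformation), here is a minimal base R sketch on a made-up matrix. The helper names normalize_ratio() and center_log() are hypothetical, and the choice to divide by medians on the linear scale and subtract them on the log scale is an assumption; the authors' exact implementation may differ. KNN imputation and ComBat batch correction are not shown.

```r
# Toy intensity matrix (made-up values): rows are proteins, columns are cells.
x <- matrix(
  c(100, 200, 400,
     50, 100, 200,
     10,  40,  20),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("prot", 1:3), paste0("cell", 1:3))
)

# Hypothetical helper: on the linear scale, divide each column by its median,
# then divide each row by its median.
normalize_ratio <- function(m) {
  m <- sweep(m, 2, apply(m, 2, median, na.rm = TRUE), "/")
  sweep(m, 1, apply(m, 1, median, na.rm = TRUE), "/")
}

# Hypothetical helper: on the log scale, subtract each column median,
# then subtract each row median.
center_log <- function(m) {
  m <- sweep(m, 2, apply(m, 2, median, na.rm = TRUE), "-")
  sweep(m, 1, apply(m, 1, median, na.rm = TRUE), "-")
}

x_norm  <- normalize_ratio(x)  # median normalized by column and by row
x_log   <- log2(x_norm)        # log2 transformed
x_final <- center_log(x_log)   # the same normalization, applied on the log scale

print(round(x_final, 3))
```

In the pipeline described above, imputation and batch correction would sit between these normalization passes, each followed by another round of column and row median normalization.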

The colData(leduc2022_plexDIA()) contains cell type annotation and batch annotation that are common to all assays. The description of the rowData fields for the precursor data can be found in the DIA-NN documentation.

Source

The links to the data were found on the Slavov Lab website. The data were downloaded from the Google Drive folder 1 and Google Drive folder 2. The raw data and the quantification data can also be found in the MassIVE repository MSV000089159: ftp://massive.ucsd.edu/MSV000089159.

Acquisition protocol

The data were acquired using the following setup. More information can be found in the source article (see References).

  • Cell isolation: CellenONE cell sorting.

  • Sample preparation: improved SCoPE2 protocol carried out on the CellenONE liquid handling system. nPOP cell lysis (DMSO) + trypsin digestion + mTRAQ-3 labeling and pooling.

  • Separation: online nLC (Dionex UltiMate 3000 UHPLC with a 25 cm x 75 µm IonOpticks Aurora Series UHPLC column; 200 nL/min).

  • Ionization: ESI (1,800 V).

  • Mass spectrometry: Thermo Scientific Q-Exactive. The duty cycle consisted of 1 MS1 scan and 4 DIA MS2 windows (120 Th, 120 Th, 200 Th and 580 Th, spanning 378-1,402 m/z). Each MS1 and MS2 scan was conducted at 70,000 resolving power, with a 3×10^6 AGC target and a 300 ms maximum injection time.

  • Data analysis: DIA-NN.

Data collection

The PSM data were collected from a shared Google Drive folder that is accessible from the SlavovLab website (see Source section). The folder contains the following files of interest:

  • annotation_plexDIA.csv: sample annotation

  • report_plexDIA_mel_nPOP.tsv: the DIA-NN output file with the precursor data

  • report.pr_matrix_channels_ms1_extracted.tsv: the DIA-NN output file with the combined precursor data

  • plexDIA_peptide.csv: the processed data table containing the peptide data

  • plexDIA_protein_imputed.csv: the processed data table containing the protein data

We removed the failed runs as identified by the authors. We also formatted the annotation and precursor quantification tables to facilitate matching between corresponding columns. The annotation and quantification tables were then combined into a single QFeatures object using scp::readSCPfromDIANN().

The plexDIA_peptide.csv and plexDIA_protein_imputed.csv files were loaded and formatted as SingleCellExperiment objects. The column names were adapted to match those in the QFeatures object. The SingleCellExperiment objects were then added to the QFeatures object, and the rows of the peptide and protein data were linked to the rows of the precursor data, based on the peptide sequence or the protein name respectively, through an AssayLink object.
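
The row linking described above amounts to matching each peptide to the precursor rows that share its sequence. The base R sketch below shows only that matching logic on made-up data. The column names precursor and peptide and the sequences are hypothetical, and this is not the code used to build the dataset. In QFeatures the resulting mapping is stored in an AssayLink object rather than in a plain list.

```r
# Toy precursor table (made-up): one row per precursor, i.e. a peptide
# sequence at a given charge state.
precursors <- data.frame(
  precursor = c("AAGK2", "AAGK3", "LLEYR2", "VTDSK2"),
  peptide   = c("AAGK", "AAGK", "LLEYR", "VTDSK"),
  stringsAsFactors = FALSE
)

# Toy peptide-level row names (made-up).
peptides <- c("AAGK", "LLEYR")

# For each peptide, collect the indices of the precursor rows with the
# same peptide sequence.
link <- lapply(peptides, function(p) which(precursors$peptide == p))
names(link) <- peptides

print(link)
```

A precursor whose sequence is absent from the peptide table (VTDSK2 in this toy example) is left unlinked, which is what one would expect when the peptide table has been filtered more strictly than the precursor data.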

References

Andrew Leduc, Gray Huffman, and Nikolai Slavov. 2022. “Droplet Sample Preparation for Single-Cell Proteomics Applied to the Cell Cycle.” bioRxiv.

Andrew Leduc, Gray Huffman, Joshua Cantlon, Saad Khan, and Nikolai Slavov. 2022. “Exploring Functional Protein Covariation across Single Cells Using nPOP.” Genome Biology 23 (1): 261.

Jason Derks, Andrew Leduc, Georg Wallmann, Gray Huffman, Matthew Willetts, Saad Khan, Harrison Specht, Markus Ralser, Vadim Demichev, and Nikolai Slavov. 2023. “Increasing the Throughput of Sensitive Proteomics by plexDIA.” Nature Biotechnology 41 (1): 50–59.

Examples

# \donttest{
leduc2022_plexDIA()
#> see ?scpdata and browseVignettes('scpdata') for documentation
#> loading from cache
#> An instance of class QFeatures containing 48 assays:
#>  [1] D..AL.AL.wAL_plexMel01.raw: SingleCellExperiment with 2495 rows and 3 columns 
#>  [2] D..AL.AL.wAL_plexMel03.raw: SingleCellExperiment with 1738 rows and 3 columns 
#>  [3] D..AL.AL.wAL_plexMel05.raw: SingleCellExperiment with 2299 rows and 3 columns 
#>  ...
#>  [46] Ms1Extracted: SingleCellExperiment with 4682 rows and 135 columns 
#>  [47] peptides: SingleCellExperiment with 3608 rows and 104 columns 
#>  [48] proteins: SingleCellExperiment with 508 rows and 105 columns 
# }