khan2023.Rd
Single-cell samples were prepared using the nPOP sample preparation method. Proteomics data were acquired using the SCoPE2 protocol on a Thermo Scientific Q-Exactive mass spectrometer. The dataset contains quantitative information on 421 MCF-10A single cells undergoing epithelial–mesenchymal transition (EMT) triggered by TGF beta. The data are available at the PSM, peptide, and protein levels. The paper investigates the dynamics of correlation modules at the protein level.
khan2023
A QFeatures object with 47 assays, each assay being a SingleCellExperiment object:
Assay 1-44: PSM data acquired with a TMTPro 16plex protocol, hence those assays contain 16 columns. Columns hold quantitative information from single-cell channels, carrier channels, reference channels, empty (negative control) channels and unused channels.
peptides: peptide data containing quantitative data for 10055 peptides and 421 single cells.
proteins_imputed: protein data containing quantitative data for 4096 proteins and 421 single cells with k-nearest neighbors (KNN) imputation.
proteins_unimputed: protein data containing quantitative data for 4096 proteins and 421 single cells without imputation.
The colData(khan2023()) contains cell type and batch annotations that
are common to all assays. The description of the rowData fields for the
PSM data can be found in the MaxQuant documentation.
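The annotations described above can be inspected as in the following minimal sketch. It assumes that the scpdata and QFeatures packages are installed and that the dataset can be downloaded (or is already cached); the variable name khan is only illustrative, and "eSK233" is the first PSM assay listed in the example output of this page.

```r
# Assumes the scpdata and QFeatures Bioconductor packages are installed;
# the first call downloads the dataset and caches it locally.
library(scpdata)
library(QFeatures)

khan <- khan2023()

# Sample annotations shared by all assays (cell type and batch).
head(colData(khan))

# Names of the last three assays (peptide and protein level).
names(khan)[45:47]

# Feature annotations of one PSM assay; these are the MaxQuant fields.
head(rowData(khan[["eSK233"]]))
```

The rowData fields differ between the PSM assays and the peptide or protein assays, so it is worth checking them per assay before combining information across levels.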
The data were downloaded from the Slavov Lab website via a shared Google Drive folder.
The raw data and the quantification data can also be found in the
MassIVE repository MSV000092872: ftp://MSV000092872@massive.ucsd.edu/.
The data were acquired using the following setup. More information
can be found in the source article (see References).
Cell isolation: CellenONE cell sorting.
Sample preparation: performed using the SCoPE2 protocol. nPOP cell lysis (DMSO) + trypsin digestion + TMTPro 16plex protocol.
Separation: online nLC (Dionex UltiMate 3000 UHPLC with a 25 cm x 75 µm IonOpticks Odyssey Series column (ODY3-25075C18); 200 nL/min).
Ionization: ESI (1,700 V).
Mass spectrometry: Thermo Scientific Q-Exactive (MS1 resolution = 70,000; MS1 accumulation time = 300 ms; MS2 resolution = 70,000).
Data analysis: MaxQuant (2.4.13.0) + DART-ID.
The PSM data were collected from a shared Google Drive folder that
is accessible from the SlavovLab website (see Source section).
The folder ('/002-singleCellDataGeneration') contains the following
files of interest:
ev_updated_NS.DIA.txt: the MaxQuant/DART-ID output file
annotation.csv: sample annotation
batch.csv: batch annotation
We combined the sample annotation and the batch annotation in
a single table. We also formatted the quantification table so that
its columns match those of the annotation, and filtered it to keep only
single-cell runs. Both tables are then combined in a single
QFeatures object using the scp::readSCP() function.
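The table preparation described above can be illustrated with a minimal base R sketch. It uses small toy tables rather than the real files, and every column name (Set, Channel, SampleType, lcbatch, Raw.file, Sequence) is an assumption made for illustration; the actual annotation.csv, batch.csv and MaxQuant output may use different names. The call to scp::readSCP() is not shown because its arguments depend on the installed scp version.

```r
# Hypothetical toy tables; the real files may use other column names.
sample_annot <- data.frame(
  Set = c("eSK233", "eSK233", "eSK234"),
  Channel = c("RI1", "RI2", "RI1"),
  SampleType = c("Carrier", "SingleCell", "SingleCell"),
  stringsAsFactors = FALSE
)
batch_annot <- data.frame(
  Set = c("eSK233", "eSK234"),
  lcbatch = c("A", "B"),
  stringsAsFactors = FALSE
)

# Combine sample and batch annotations in a single table, keeping
# every row of the sample annotation.
annot <- merge(sample_annot, batch_annot, by = "Set", all.x = TRUE)

# Toy quantification table with one run that is not annotated.
quant <- data.frame(
  Raw.file = c("eSK233", "eSK234", "blank01"),
  Sequence = c("AAK", "CCR", "AAK"),
  stringsAsFactors = FALSE
)

# Keep only the runs that appear in the annotation table.
quant_sc <- quant[quant$Raw.file %in% annot$Set, ]
```

Merging on a shared run identifier keeps the two annotation sources consistent, and filtering the quantification table against the merged annotation ensures that every remaining run has sample metadata.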
The peptide data were taken from the same Google Drive folder
(EpiToMesen.TGFB.nPoP_trial1_pepByCellMatrix_NSThreshDART_medIntCrNorm.txt).
The data were formatted to a SingleCellExperiment object and the sample
metadata were matched to the column names (mapping is retrieved
after running the SCoPE2 R script, EMTTGFB_singleCellProcessing.R) and
stored in the colData. The object is then added to the QFeatures object
and the rows of the PSM data are linked to the rows of the peptide data
based on the peptide sequence information through an AssayLink object.
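The idea of linking PSM rows to peptide rows by peptide sequence can be illustrated with the base R sketch below. It does not build an actual AssayLink object and is not the code used to create the dataset; the toy tables and the Sequence and psm column names are assumptions made for illustration only.

```r
# Hypothetical toy data: three PSMs that map to two peptide sequences.
psm_rows <- data.frame(
  psm = c("psm1", "psm2", "psm3"),
  Sequence = c("AAK", "CCR", "AAK"),
  stringsAsFactors = FALSE
)
pep_rows <- data.frame(
  Sequence = c("AAK", "CCR"),
  stringsAsFactors = FALSE
)

# For every PSM row, the index of the peptide row sharing its sequence.
psm_to_pep <- match(psm_rows$Sequence, pep_rows$Sequence)

# For every peptide sequence, the PSM rows linked to it.
pep_to_psm <- split(psm_rows$psm, psm_rows$Sequence)
```

In the QFeatures package, this kind of relation between assays is stored in the object itself (for instance through addAssayLink()), so that features can be traced from one level to another.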
The imputed protein data were taken from the same Google Drive folder
(EpiToMesen.TGFB.nPoP_trial1_ProtByCellMatrix_NSThreshDART_medIntCrNorm_imputedNotBC.csv).
The data were formatted to a SingleCellExperiment object and the sample
metadata were matched to the column names (mapping is retrieved
after running the SCoPE2 R script, EMTTGFB_singleCellProcessing.R) and
stored in the colData. The object is then added to the QFeatures object
and the rows of the peptide data are linked to the rows of the protein data
based on the protein sequence information through an AssayLink object.
The unimputed protein data were taken from the same Google Drive folder
(EpiToMesen.TGFB.nPoP_trial1_ProtByCellMatrix_NSThreshDART_medIntCrNorm_unimputed.csv).
The data were formatted and added to the QFeatures object in the same way as the imputed protein data.
Saad Khan, Rachel Conover, Anand R. Asthagiri, Nikolai Slavov. 2023. "Dynamics of single-cell protein covariation during epithelial–mesenchymal transition." bioRxiv. (link to article).
# \donttest{
khan2023()
#> see ?scpdata and browseVignettes('scpdata') for documentation
#> loading from cache
#> An instance of class QFeatures containing 47 assays:
#> [1] eSK233: SingleCellExperiment with 5951 rows and 16 columns
#> [2] eSK234: SingleCellExperiment with 6375 rows and 16 columns
#> [3] eSK235: SingleCellExperiment with 6261 rows and 16 columns
#> ...
#> [45] peptides: SingleCellExperiment with 10055 rows and 421 columns
#> [46] proteins_imputed: SingleCellExperiment with 4096 rows and 421 columns
#> [47] proteins_unimputed: SingleCellExperiment with 4096 rows and 421 columns
# }