Introduction
Since its inception in the mid-2000s, next-generation sequencing (NGS) has become a critically important technology within the scientific community, allowing researchers to draw correlations between diseases and changes in nucleic acid sequences or expression. This has advanced the study of RNA, as analyzing changes in the transcriptome of samples is cheaper and easier than ever before. Whole transcriptome analysis, or RNA sequencing (RNA-seq), is an unbiased tool used to detect and quantify changes in the transcriptome of cells. Specifically, it allows researchers to observe and measure alterations in messenger RNA (mRNA) expression levels and in mRNA splicing and quality control mechanisms, and to detect mRNA mutations that may affect protein function. Even with low sample input, RNA-seq provides quantitative results with single-base resolution [1: Single-Cell RNA-Seq Technologies and Related Computational Data Analysis].
Advances in microfluidics and molecular biology have led to RNA-seq methods being applied at single-cell resolution. Single-cell transcriptomics (scRNA-seq) can simultaneously provide the transcriptomes of thousands of individual cells within a sample. This can be especially useful for profiling cellular samples with high degrees of heterogeneity [2: See P, Lum J, Chen J, et al. A Single-Cell Sequencing Guide for Immunologists]. The available scRNA-seq methods can be largely divided into two categories: 1) plate-based full-length sequencing approaches, which generate whole-transcript cDNA sequences (e.g., Smart-seq2), and 2) droplet-based, unique molecular identifier (UMI) labeling methods, in which the 3’ ends of the mRNA transcripts arising from different cells are uniquely tagged. UMI approaches have several distinct advantages, namely higher throughput due to the ability to multiplex cells and lower costs per cell sequenced [2]. The 10X Genomics Single Cell 3’ kit provides UMI-barcoded cDNA from single-cell suspensions using proprietary gel bead-in-emulsion (GEM) technology. In this system, uniquely barcoded gel beads containing reverse transcription reagents are mixed with a limiting dilution of cells. Following cell lysis, reverse transcription generates uniquely labeled cDNA from each cell's polyA-tailed mRNA, which is then amplified and carried forward for subsequent processing. Although most scRNA-seq workflows, such as the 10X Genomics protocol, generate 3’-based libraries via polyA-tail pooling, other methods enabling full-length scRNA-seq libraries have recently been developed and automated. In one such study, Mamanova et al. describe two automated methods lasting 3 to 5 days from start to finish, with a 3-4 hour library preparation process [4: Mamanova L, Miao Z, Jinat A, et al. High-Throughput Full-Length Single-Cell RNA-seq Automation].
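The UMI labeling described above can be illustrated with a minimal sketch of how reads might be collapsed into molecule counts. This is not the 10X Genomics pipeline: the function name, the tuple layout, and the toy barcodes are hypothetical, and the sketch assumes error-free barcodes and UMIs, whereas real pipelines also correct sequencing errors in both.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into per-cell, per-gene molecule counts.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples (a hypothetical
    layout). Reads sharing all three values are treated as amplification
    duplicates of a single captured mRNA molecule and are counted once.
    """
    unique_molecules = set(reads)
    counts = defaultdict(lambda: defaultdict(int))
    for cell_barcode, _umi, gene in unique_molecules:
        counts[cell_barcode][gene] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {cell: dict(genes) for cell, genes in counts.items()}


# Toy input: five reads from two cells; the second read duplicates the first.
example_reads = [
    ("AAAC", "UMI1", "ACTB"),
    ("AAAC", "UMI1", "ACTB"),
    ("AAAC", "UMI2", "ACTB"),
    ("AAAC", "UMI3", "GAPDH"),
    ("TTTG", "UMI1", "ACTB"),
]
umi_counts = count_umis(example_reads)
```

Because duplicates collapse to one molecule, the count for a gene reflects captured transcripts and not how strongly each one was amplified.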
One of the challenges in scRNA-seq workflows is the generation of sequencing libraries following single-cell cDNA generation. Once GEMs are generated and a quality check is performed on the cDNA output, sequencing library preparation is laborious and time-intensive. It requires precise timing and pipetting proficiency by the operator and can take up to 5 hours depending on sample count. As the number of samples increases, these challenges become difficult to overcome through manual processing. To address these issues, we sought to automate NGS library preparation for scRNA-seq applications. Here we develop and validate a walk-away, automated method that increases throughput, allowing processing of up to 48 samples per run, and decreases the required hands-on time by more than 75% (from 4 hr to 45 min). Single-cell-derived cDNA was used as input to create NGS libraries, either manually or using the automated method, and the resulting libraries were compared following Illumina sequencing using standard NGS and single-cell quality control metrics.
Discussion
Technological advances within the last two decades have enabled researchers to examine transcriptomics at the single-cell level [14: Single-Cell Transcript Analysis of Pancreas Development]. As the number of cells that can be sequenced in a single experiment increases, the sequencing cost per cell decreases, and this has led to several new, commercially available single-cell assay kits that are high-throughput compatible. Each of these has its own strengths and weaknesses, so the workflow selection should be based primarily on the biological system being studied. The first step of these workflows generally involves the isolation of single cells, which can be accomplished using microfluidic droplet-based systems, such as the Chromium (10X Genomics), ddSEQ (BioRad), and Drop-Seq, or using a nanowell microchip system, such as the ICELL8 (Takara Bio). Recent work by Yamawaki and coworkers compared the ability of these systems to generate scRNA-seq libraries from a heterogeneous mixture of immune cells. This work found that the Chromium workflow captured significantly more cells than the other cell isolation methods tested, which is an important factor when working with precious samples [15: Yamawaki TM, Lu DR, Ellwanger DC, et al. Systematic Comparison of High-Throughput Single-Cell RNA-seq Methods for Immune Cell Profiling]. Furthermore, compared with the other systems mentioned above, the 10X workflow was the most sensitive for mRNA detection, as it produced the most complex libraries with the highest gene counts observed in the study. A similar study compared thirteen different scRNA-seq workflows and benchmarked them according to factors such as gene detection, marker expression, and the ability to identify and cluster different cell types. Again, the Chromium system performed well, ranking behind only Quartz-seq2 [16: Mereu E, Lafzi A, Moutinho C, et al. Benchmarking Single-Cell RNA-sequencing Protocols for Cell Atlas Projects]. Notably, Quartz-seq2 is a plate-based system that uses a cell sorter to isolate single cells [17: Sasagawa Y, Danno H, Takada H, et al. Quartz-Seq2: A High-Throughput Single-Cell RNA-sequencing Method that Effectively uses Limited Sequence Reads]. The microfluidic Chromium system can partition and capture more cells per experiment in less time, so when an experiment requires analysis of a large number of cells, the Chromium system may be preferred.
Single-cell RNA-seq has continuously improved in terms of cost, quality, speed and yield since its inception in 2010, when libraries were generated from 16 cells within six days [18: Tang F, Barbacioru C, Nordman E, et al. RNA-Seq Analysis to Capture the Transcriptome Landscape of a Single Cell]. A decade later, genomics laboratories are routinely processing 5,000 cells per sample and generating libraries within a couple of days, including sequencing. In terms of manual handling and time, the bottleneck for scRNA-seq protocols remains the generation of the single-cell suspension. Tissue dissociation itself may require significant optimization depending on the tissue source, with various dissociation methods available (e.g., enzymatic or mechanical). Even for tissues originating from a single source, we have found it necessary to optimize dissociation based on the disease status (e.g., healthy vs fibrotic lung). At this stage and up until single-cell droplet encapsulation, also called GEM generation in the 10X Genomics workflow, samples often need to be handled on an individual basis. This work often requires both intervention by the laboratory operator and user experience to decide on sample inclusion or exclusion based on quality control factors important to scRNA-seq quality, such as cell viability and aggregation rate. As there are so many variables in this process, we suggest there is very little to gain from automating this upstream part of the scRNA-seq workflow. However, when considering the post-GEM generation portion of the workflow, automation becomes an attractive option after cDNA is generated from individual samples, normalized, and stored. At this stage, user intervention is no longer needed and processing is identical across all samples, regardless of cell type, tissue source, or disease status. Once cDNA is produced, one could even combine different projects and generate libraries in a single batch. With this in mind, we focused our efforts on developing and validating a fully walk-away method capable of generating up to 48 scRNA-seq libraries in 3 hours, with end-to-end robotic handling, requiring only 45 minutes of setup time from the laboratory operator.
We first demonstrated that repeated sequencing of the same scRNA-seq library shows a small but detectable run-to-run variability, which we established to be R=0.973 in our experimental set-up. This was determined by sequencing the same library twice and comparing the transcriptomes of each individual cell. Depending on the cell type or cellular state, there are between 10^5 and 10^6 mRNA molecules present in any given cell, with more than 10,000 expressed genes [19: Hwang B, Lee JH, Bang D. Single-cell RNA Sequencing Technologies and Bioinformatics Pipelines]. The molecular diversity of a scRNA-seq library is therefore very high, due not only to transcript diversity, but also to the unique molecular identifiers (UMIs) incorporated into the cDNA, the sample indexes needed for sample demultiplexing, and the cell barcodes required for cell demultiplexing. One way to mitigate such run-to-run sequencing variability is to increase sequencing depth until saturation is reached. The depth at which saturation is reached depends on the mRNA molecular diversity of the samples.
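The notion of sequencing to saturation can be made concrete with a small sketch. It assumes one commonly used definition, which is not stated in this article: saturation is one minus the ratio of unique (cell barcode, UMI, gene) combinations to total reads. The function name and the toy reads are hypothetical.

```python
def sequencing_saturation(reads):
    """Return the fraction of reads that re-observe an already seen molecule.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples. The assumed
    definition is: saturation = 1 - unique_molecules / total_reads.
    """
    reads = list(reads)
    if not reads:
        raise ValueError("at least one read is required")
    return 1 - len(set(reads)) / len(reads)


# Shallow run: every read comes from a different molecule.
shallow_run = [
    ("AAAC", "UMI1", "ACTB"),
    ("AAAC", "UMI2", "ACTB"),
    ("AAAC", "UMI3", "GAPDH"),
    ("TTTG", "UMI1", "ACTB"),
]
# Deeper run of the same library: one molecule is sampled a second time.
deeper_run = shallow_run + [("AAAC", "UMI1", "ACTB")]

shallow_saturation = sequencing_saturation(shallow_run)
deeper_saturation = sequencing_saturation(deeper_run)
```

Under this definition, a value near 1 means additional reads mostly resample known molecules, so a library with higher molecular diversity needs more reads to reach the same value.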
Knowing that the variability inherent in the sequencing workflow led to R=0.973, we then looked at the correlation between manual and automated processing. The correlation between manually prepared and automated libraries (R=0.971) was virtually identical to that attributable to sequencing variability alone. Had the automated processing introduced significant library differences on top of the sequencing variability, we would have expected a correlation much lower than that of sequencing variability alone. Taken together, these results show that scRNA-seq libraries generated by the automated method on the Biomek are indistinguishable from manually prepared libraries.
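The comparison logic above can be sketched as follows. The article does not specify which correlation statistic or normalization underlies the reported R values, so this sketch assumes a Pearson correlation on raw per-cell, per-gene counts. All names and numbers below are hypothetical toy values, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def compare_libraries(counts_a, counts_b):
    """Correlate two libraries keyed by (cell_barcode, gene).

    Entries missing from one library are treated as zero counts.
    """
    keys = sorted(set(counts_a) | set(counts_b))
    x = [counts_a.get(k, 0) for k in keys]
    y = [counts_b.get(k, 0) for k in keys]
    return pearson_r(x, y)


# Hypothetical toy counts for a manual and an automated library.
manual = {("cell1", "ACTB"): 10, ("cell1", "GAPDH"): 4,
          ("cell2", "ACTB"): 7, ("cell2", "IL1B"): 1}
automated = {("cell1", "ACTB"): 11, ("cell1", "GAPDH"): 4,
             ("cell2", "ACTB"): 6, ("cell2", "IL1B"): 2}

r_manual_vs_automated = compare_libraries(manual, automated)
r_identical = compare_libraries(manual, manual)
```

Applying the same function to two sequencing runs of a single library gives the baseline against which a manual-versus-automated value can be judged.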
As sequencing costs have significantly decreased, scRNA-seq experiments have scaled up exponentially and become routine [20: Svensson V, Vento-Tormo R, Teichmann SA. Exponential Scaling of Single-Cell RNA-seq in the Past Decade]. This scaling up may be further aided by liquid-handling robotics such as the Biomek i7 Workstation platform. Additionally, tagging and multiplexing of samples prior to droplet encapsulation, also known as cell hashing, can significantly reduce reagent volumes and consumable costs. This can be done via either antibody- or lipid-based multiplexing methods for both single nuclei and single cells [21: Mylka V, Aerts J, Matetovici I, et al. Comparative Analysis of Antibody- and Lipid-Based Multiplexing Methods for Single-Cell RNA-seq, 22: Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous Epitope and Transcriptome Measurement in Single Cells]. The new 10X Genomics lipid-based multiplexing method for scRNA-seq promises to pool up to 12 samples in a species-agnostic manner (https://support.10xgenomics.com/permalink/3RWaZm0kJBmtSjB2UGRieV). Notably, the library processing automated in the present study is compatible with the newly released Cell Multiplexing 10X kit. Ultimately, the use of walk-away automated methods can help reduce users’ hands-on time so that they can attend to other important laboratory tasks.
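How pooled, tagged samples might be separated again after sequencing can be sketched with a deliberately simple rule. This is an illustrative assumption and not the method used by 10X Genomics or by the cited studies: each cell is assigned to the sample whose tag dominates its tag counts, and the function name, thresholds, and tag labels are hypothetical.

```python
def assign_sample(tag_counts, min_count=10, min_ratio=3.0):
    """Assign one cell to a sample from its per-tag read counts.

    `tag_counts` maps a sample tag to the number of tag reads seen in the cell.
    Returns the dominant tag, "negative" when too few tag reads are present,
    or "ambiguous" when no tag clearly dominates (a possible multiplet).
    The thresholds are illustrative defaults, not validated values.
    """
    ranked = sorted(tag_counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] < min_count:
        return "negative"
    if len(ranked) > 1 and ranked[0][1] < min_ratio * ranked[1][1]:
        return "ambiguous"
    return ranked[0][0]


# Hypothetical cells from a pool of two tagged samples.
clear_cell = assign_sample({"tagA": 120, "tagB": 5})
mixed_cell = assign_sample({"tagA": 60, "tagB": 50})
empty_cell = assign_sample({"tagA": 2, "tagB": 1})
```

Published demultiplexing tools generally fit statistical models to the tag count distributions instead of using fixed thresholds, so this sketch conveys only the idea.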
It should be noted that the present study establishes high correlation between the scRNA-seq manual and automated library processes for macrophages undergoing cytokine-induced polarization. Other cell types, such as small airway epithelial cells (SAECs), were also tested for reproducibility and an even higher correlation was found (R=0.992, n=8 – data not shown). However, further investigation would be needed to establish the exact repeatability of the automated method itself and whether it is higher than that of the manual handling. Furthermore, the 10X single-cell method was written on the Biomek i-Series Software version BM5.1.10 and designed for a Biomek i7 Hybrid instrument. Based on hardware requirements of the method, other Biomek instruments with the deck space to accommodate the automated thermal cycler, an orbital shaker, two Peltiers, and a total of 20 individual deck positions (Supplementary Figure 2), would be capable of performing workflows similar to the automated method described here. Therefore, the method can be adapted to other Biomek i-Series models, such as the i7 Dual Multichannel or the i7 Dual Span-8, as well as smaller instruments, such as the i5 Multichannel or i5 Span-8. In theory, an analogous method could also be adapted on the Biomek 4 software for older Biomek systems (FXP) equipped with the required devices and integrations.
In summary, we have shown that automated library preparation yields single-cell gene expression libraries of the same high quality as manual processing, while significantly reducing hands-on time. Further large-scale studies are needed to ensure reproducibility and stability across multiple automated scRNA-seq library generation runs. We expect that in the near future, many labs will have the tools and ability to integrate similar systems and generate scRNA-seq data at scale.
Article info
Publication history
Published online: October 27, 2021
Copyright
© 2021 The Authors. Published by Elsevier Inc. on behalf of Society for Laboratory Automation and Screening.