Automation enables high-throughput and reproducible single-cell transcriptomics library preparation

Open Access. Published: October 27, 2021. DOI: https://doi.org/10.1016/j.slast.2021.10.018

      Abstract

      Next-generation sequencing (NGS) has revolutionized genomics, decreasing sequencing costs and allowing researchers to draw correlations between diseases and DNA or RNA changes. Technical advances have enabled the analysis of RNA expression changes between single cells within a heterogeneous population, known as single-cell RNA-seq (scRNA-seq). Despite resolving transcriptomes of cellular subpopulations, scRNA-seq has not replaced RNA-seq, due to higher costs and longer hands-on time. Here, we developed an automated workflow to increase throughput (up to 48 reactions) and to reduce the hands-on time of scRNA-seq library preparation by 75%, using the 10X Genomics Single Cell 3’ kit. After gel bead-in-emulsion (GEM) generation on the 10X Genomics Chromium Controller, cDNA amplification was performed, and the product was normalized and subjected to either the manual, standard library preparation method or a fully automated, walk-away method using a Biomek i7 Hybrid liquid handler. Control metrics showed that the single-cell gene expression libraries generated by the two methods were equivalent in size and yield. Key scRNA-seq downstream quality metrics, such as unique molecular identifier (UMI) counts, mitochondrial RNA content, and cell and gene counts, further showed high correlations between automated and manual workflows. Using the UMAP dimensionality reduction technique to visualize all cells, we were able to further correlate the results observed between the manual and automated methods (R=0.971). The method developed here allows for the fast, error-free, and reproducible multiplex generation of high-quality single-cell gene expression libraries.

      Introduction

      Since its inception in the mid-2000s, next-generation sequencing (NGS) has become a critically important technology within the scientific community, allowing researchers to draw correlations between diseases and changes in nucleic acid sequences or expression. This has advanced the study of RNA, as analyzing changes in the transcriptome of samples is cheaper and easier than ever before. Whole transcriptome analysis, or RNA sequencing (RNA-seq), is an unbiased tool used to detect and quantify changes in the transcriptome of cells. Specifically, it allows researchers to observe and measure alterations in messenger RNA (mRNA) expression levels, mRNA splicing and quality control mechanisms, and to detect mRNA mutations that may affect protein function. Even with low sample input, RNA-seq provides quantitative results with single-base resolution [
      • Chen G.
      • Ning B.
      • Shi T.
      Single-Cell RNA-Seq Technologies and Related Computational Data Analysis.
      ].
      Advances in microfluidics and molecular biology have led to RNA-seq methods being applied at single cell resolution. Single-cell transcriptomics (scRNA-seq) can simultaneously provide the transcriptomes of thousands of individual cells within a sample. This can be especially useful for profiling cellular samples with high degrees of heterogeneity [
      • See P.
      • Lum J.
      • Chen J.
      • et al.
      A Single-Cell Sequencing Guide for Immunologists.
      ]. The scRNA-seq methods available can be largely divided into two categories: 1) plate-based full-length sequencing approaches, which generate whole-transcript cDNA sequences (e.g., Smart-seq2), and 2) droplet-based, unique molecular identifier (UMI) labeling methods, where the 3’ end of the mRNA transcripts arising from different cells are uniquely tagged. UMI approaches have several distinct advantages, namely higher throughput due to the ability to multiplex cells and lower costs per cell sequenced [
      • See P.
      • Lum J.
      • Chen J.
      • et al.
      A Single-Cell Sequencing Guide for Immunologists.
      ]. The 10X Genomics Single Cell 3’ kit provides UMI-barcoded cDNA from single-cell suspensions using proprietary gel bead-in-emulsion (GEM) technology. In this system, uniquely barcoded gel beads containing reverse transcription reagents are mixed with a limiting dilution of cells. Following cell lysis, reverse transcription generates uniquely labeled cDNA from each cell's polyA-tailed mRNA, which is then amplified and carried forward for subsequent processing. Although most scRNA-seq workflows such as the 10X Genomics protocol generate 3’-based libraries via polyA-tail pooling, other methods enabling full-length scRNA-seq libraries have been recently developed and automated. In that work, Mamanova et al. describe two automated methods lasting 3 to 5 days from start to finish, with a 3-4 hour library preparation process [
      • Mamanova L.
      • Miao Z.
      • Jinat A.
      • et al.
      High-Throughput Full-Length Single-Cell RNA-seq Automation.
      ].
      One of the challenges that exists in scRNA-seq workflows is the generation of sequencing libraries following single-cell cDNA generation. Once GEMs are generated and quality check is performed on the cDNA output, sequencing library preparation is laborious and time intensive. It requires precise timing and pipetting proficiency by the operator and can take up to 5 hours depending on sample count. As the number of samples increases, these challenges become difficult to overcome through manual processing. To address these issues, we sought to automate NGS library preparation for scRNA-seq applications. Here we develop and validate a walk-away, automated method that increases throughput, allowing processing of up to 48 samples per run, and decreases the amount of required hands-on time by more than 75% (from 4 hr to 45 min). Single-cell-derived cDNA was used as input to create NGS libraries, either manually or using the automated method, and the resulting libraries were compared following Illumina sequencing using standard NGS and single-cell quality control metrics.

      Materials and methods

      Single-cell suspension generation

      Murine bone marrow-derived cells were isolated from C57BL/6J animals (femurs and tibias), as previously described [
      • Koss C.K.
      • Wohnhaas C.T.
      • Baker J.R.
      • et al.
      IL36 is a Critical Upstream Amplifier of Neutrophilic Lung Inflammation in Mice.
      ]. Cells were resuspended, expanded and differentiated in macrophage media with macrophage colony-stimulating factor (MCSF) for 7 days in an incubator at 37 °C with 5% CO2, in the presence or absence of IL4, IL13 and TNFα. Bone marrow-derived macrophage (BMDM) cultures were then detached and evaluated for viability, aggregation, and cell concentration using the NucleoCounter NC-3000 (ChemoMetec, Allerod, Denmark). In this assay, cell nuclei are stained using a fluorescent dye. Cell clumps and aggregates are identified via an image analysis algorithm running on the NucleoCounter system, which segments single nuclei within each smaller aggregate. For scRNA-seq experiments, we typically use only samples with a cell aggregation rate lower than 5%.

      Single-cell RNA sequencing

      Single-cell libraries were generated using the Single Cell 3′ Reagent Kit v3.1 chemistry (10X Genomics, Pleasanton, CA). Briefly, ∼8,700 cells per sample were loaded onto a G Chip in order to capture a final 5,000 cells. Doublet rates of 0.98% to 7.3% were observed (Supplementary Table 1). GEM generation on the Chromium Controller and reverse transcription yielded an average cDNA output of 623 ng across all samples. An equal cDNA input of 100 ng was used to generate the gene expression libraries, either by the manual method or by the Biomek i7 Hybrid automated method (Beckman Coulter Life Sciences, Indianapolis, IN), using 12 and 13 PCR cycles, respectively. Both methods followed the manufacturer's instructions, except for an additional bead cleanup performed prior to the final library elution to remove any residual adapter and primer dimer. A protocol detailing the Biomek loading procedures, the robotic program (Biomek Method File - bmf format), and the corresponding robotic step-by-step method have been made available and can be found in the Supplementary Material. Libraries from the Biomek and manual handling were quantitatively and qualitatively assessed using a Qubit 4 Fluorometer (Thermo Fisher, Waltham, MA) and a Fragment Analyzer (Agilent, Santa Clara, CA), respectively. Manually generated and Biomek-automated gene expression libraries averaged 2.4 and 3.1 pmol in yield and 457 bp and 466 bp in length, respectively, with no detectable adapter dimer. Single-cell libraries were normalized, pooled, spiked with 1% PhiX Control v3 (Illumina, San Diego, CA) and sequenced on an Illumina NextSeq500 instrument with single-index, paired-end reads (Read parameters: Rd1: 28, Rd2: 8, Rd3: 91). Sequencing of the manually prepared libraries was carried out twice to assess run-to-run sequencing reproducibility and variance. These are referred to as Manual #1 and Manual #2 in the Results section.
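The library yields above are reported in pmol while fluorometric quantification returns a mass. The following is a minimal sketch of the usual conversion between the two, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA. The text does not state which conversion the authors used, so the constant and both function names are illustrative assumptions.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used rule of thumb, assumed here; it is not taken from the text.
AVG_BP_MASS_G_PER_MOL = 660.0


def dsdna_ng_to_pmol(mass_ng: float, mean_length_bp: float) -> float:
    """Convert a dsDNA mass in ng to an amount in pmol for a given mean fragment length."""
    if mass_ng < 0 or mean_length_bp <= 0:
        raise ValueError("mass must be >= 0 and length must be > 0")
    # ng -> pg is a factor of 1000; one pmol of a 1-bp duplex weighs about 660 pg.
    return mass_ng * 1000.0 / (mean_length_bp * AVG_BP_MASS_G_PER_MOL)


def dsdna_pmol_to_ng(amount_pmol: float, mean_length_bp: float) -> float:
    """Inverse conversion: amount in pmol to mass in ng."""
    if amount_pmol < 0 or mean_length_bp <= 0:
        raise ValueError("amount must be >= 0 and length must be > 0")
    return amount_pmol * mean_length_bp * AVG_BP_MASS_G_PER_MOL / 1000.0


if __name__ == "__main__":
    # Reported averages: 2.4 pmol at 457 bp (manual) and 3.1 pmol at 466 bp (Biomek).
    for label, pmol, bp in (("manual", 2.4, 457), ("Biomek", 3.1, 466)):
        print(f"{label}: ~{dsdna_pmol_to_ng(pmol, bp):.0f} ng implied by {pmol} pmol at {bp} bp")
```

Under this approximation, the implied masses are only estimates, since the Fragment Analyzer peak size stands in for the full size distribution of each library.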

      Bioinformatics analysis

      Initial pre-processing of single-cell data, from raw intensity data to FASTQ files, was performed using CellRanger v3.0.2 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation). STARsolo v2.7.1a (https://github.com/alexdobin/STAR/blob/master/docs/STARsolo.md) was used to align reads to the mouse reference genome (GRCm38.86) and featureCounts, implemented in the subread R package (v1.6.4), was used to generate read count matrices individually for each sample [
      • Dobin A.
      • Davis C.A.
      • Schlesinger F.
      • et al.
      STAR: Ultrafast Universal RNA-seq Aligner.
      ]. Sequencing saturation was calculated using the count function of the CellRanger pipeline and is a measure of the library complexity and sequencing depth. It represents the fraction of reads that are non-unique (i.e. duplicate of an already existing cell barcode-UMI-gene combination): n_duped_reads divided by n_reads, where n_duped_reads represents the number of mapped reads with the same cell barcode, UMI and gene; n_reads represents the total number of mapped reads. Cell multiplets were computationally identified using Single-Cell Remover of Doublets (Scrublet) [
      • Wolock S.L.
      • Lopez R.
      • Klein A.M.
      Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomics Data.
      ]. After removal of the empty droplets, the following parameters were used to filter cells using Seurat 3:  fewer than 300 genes, more than 5,000 genes, or mitochondrial gene content greater than 0.15 [
      • Stuart T.
      • Butler A.
      • Hoffman P.
      • et al.
      Comprehensive Integration of Single-Cell Data.
      ]. These outlier cells were removed from further downstream analyses. Subsequently, quality control, log normalization of total UMI counts per cell, identification of highly variable genes (HVGs), and dimensionality reduction using PCA and UMAP were performed, and the sample comparison figures were generated, using modules implemented in Scanpy (v1.4.5) [
      • Wolf F.A.
      • Angerer P.
      • Theis F.J.
      SCANPY: Large-Scale Single-Cell Gene Expression Data Analysis.
      ]. Specifically, all data were integrated into a single dataset and individual gene expression was normalized to a total of 10,000 UMIs per cell. The log (normalized expression + 1) was used to identify HVGs and to compute PCA. The corresponding Pearson correlations were calculated between the top 50 PCA components based on the identified HVGs.
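The pre-processing steps described in this section can be sketched in a few lines. The example below is a simplified, standard-library illustration of three of them: sequencing saturation as the fraction of duplicate reads, the cell filter (300 to 5,000 genes, mitochondrial fraction at most 0.15), and normalization to 10,000 UMIs per cell followed by log(x + 1). It is not the CellRanger, Seurat or Scanpy implementation; all function names and the data layout are assumptions made for illustration.

```python
import math


def sequencing_saturation(reads):
    """Fraction of mapped reads that duplicate an already seen
    (cell barcode, UMI, gene) combination.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per mapped read.
    """
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")
    n_unique = len(set(reads))
    return (len(reads) - n_unique) / len(reads)


def passes_qc(n_genes, mito_fraction, min_genes=300, max_genes=5000, max_mito=0.15):
    """Keep a cell unless it has fewer than 300 genes, more than 5,000 genes,
    or a mitochondrial gene content greater than 0.15."""
    return min_genes <= n_genes <= max_genes and mito_fraction <= max_mito


def log_normalize(counts, target_sum=10_000):
    """Scale one cell's UMI counts to `target_sum` total UMIs, then apply log(x + 1).

    `counts` maps gene name -> UMI count for a single cell.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("cell has no UMIs")
    return {gene: math.log1p(c * target_sum / total) for gene, c in counts.items()}
```

For instance, four reads of which two share the same barcode, UMI and gene give a saturation of 0.25, because one of the four reads is a duplicate.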

      Results

      Single-cell suspension led to high quality cDNA, adequate for subsequent library generation

      Single-cell suspensions from murine bone marrow-derived macrophages were assessed for quality and yield in order to determine their suitability for GEM generation and cDNA synthesis on the Chromium Controller microfluidics platform (Figure 1). The resulting quality check based on fluorescently labelled Annexin V measurements showed that all samples had high viability and sufficient cell count with low aggregate rate (Figure 2 A-C). Our standard acceptance thresholds were >90% viable cells and <5% aggregated cells; samples outside these thresholds are excluded. These thresholds were selected in order to guarantee a low ambient RNA content and a reduced doublet rate downstream, respectively. Single-cell suspensions were normalized and loaded on the Chromium Controller, targeting an average of ∼5,000 captured cells per sample. Following GEM generation and cDNA amplification, the output was assessed for quantity and quality. The average cDNA yield was 623 ng (Figure 2D), an output sufficient for subsequent library processing, both manually and via Biomek automation. The corresponding cDNA profiles, as shown by the collated Fragment Analyzer traces (Figure 2E), displayed typical high molecular weight peaks (∼800-1500 bp), which were in the expected range. The cDNA output was then normalized and aliquoted for manual and automated library preparation (100 ng each).
      Fig. 1. Experimental Design. Single-cell suspensions were assessed quantitatively and qualitatively using the NucleoCounter system prior to GEM generation in the Chromium Controller device. Subsequently, cDNA amplification and cleanup were performed, followed by a quality check. After cDNA normalization, each sample was prepared for manual or automated (Biomek i7 Hybrid workstation) library preparation. Sequencing libraries were then assessed quantitatively and qualitatively before normalization, pooling and sequencing. Demultiplexing and data processing were performed, and side-by-side comparison of key single-cell metrics was done.
      Fig. 2. Cell and cDNA synthesis QC. Cell count (A), viability (B) and aggregate count (C) were assessed for all six single-cell suspensions. Our standard thresholds of >90% viable cells and <5% aggregate were applied. After GEM and cDNA synthesis, quantitative and qualitative measurements were performed to assess cDNA yield (D) and Fragment Analyzer trace (E), respectively. All samples passed QC with a cDNA yield higher than 100 ng.

      Side-by-side comparison of gene expression libraries for scRNA-seq using a manual or an automated approach shows equivalent performance

      Massively parallel single-cell RNA sequencing allows scientists to dissect complex biological systems at high resolution and genomic-scale granularity. In practice, this is achieved via single-cell partitioning of thousands of cells using a microfluidics platform, the 10X Chromium Controller. Within this device, each individual cell is encapsulated into a nanoliter-scale oil droplet called a Gel Bead-in-Emulsion (GEM). In each functional GEM, the single cell is lysed locally and in isolation from the others via oil compartmentalization. Reverse transcription of the released poly-adenylated mRNA molecules occurs via polydT primer capture sequences coating the Gel Bead surface. Besides a ubiquitous polydT capture sequence, each capture oligonucleotide also contains a universal sequencing primer which is the same for all GEMs, and two barcodes: 1) a cell barcode which is the same within a single GEM and is used to index the encapsulated cell, thereby tagging the entire cell's transcriptome and 2) a unique molecular identifier (UMI), which is different for each oligo coating the Gel Bead and is used to index one single mRNA molecule. As a result, within each individual GEM, all synthesized cDNA molecules will be simultaneously tagged by the same cell barcode and by unique tags, allowing the sequencing reads to be mapped back computationally to their respective cell and mRNA molecule of origin (Supplementary Figure 1).
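The two-level barcoding described above can be illustrated with a short sketch that splits Read 1 into a cell barcode and a UMI, then counts unique molecules per cell and gene. The 16 bp barcode plus 12 bp UMI layout is an assumption based on the 10X v3 chemistry documentation and is consistent with the 28-cycle Read 1 reported in the Methods, but it is not stated in the text. The function names and the input format are hypothetical.

```python
from collections import defaultdict

# Assumed Read 1 layout for 10X v3/v3.1 chemistry (not stated in the text):
CELL_BARCODE_LEN = 16  # identical for every oligo on one Gel Bead
UMI_LEN = 12           # differs between oligos on the same Gel Bead


def split_read1(read1):
    """Split a Read 1 sequence into (cell_barcode, umi)."""
    if len(read1) < CELL_BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode + UMI")
    barcode = read1[:CELL_BARCODE_LEN]
    umi = read1[CELL_BARCODE_LEN:CELL_BARCODE_LEN + UMI_LEN]
    return barcode, umi


def count_molecules_per_cell(records):
    """Count unique UMIs per (cell, gene).

    `records` is an iterable of (read1_sequence, gene) pairs, where the gene
    comes from aligning the cDNA read. Reads sharing a cell barcode, UMI and
    gene are treated as PCR duplicates of one original mRNA molecule.
    """
    umis = defaultdict(lambda: defaultdict(set))
    for read1, gene in records:
        barcode, umi = split_read1(read1)
        umis[barcode][gene].add(umi)
    return {
        barcode: {gene: len(seen) for gene, seen in genes.items()}
        for barcode, genes in umis.items()
    }
```

Production pipelines additionally correct sequencing errors in barcodes and UMIs and match barcodes against a whitelist; those steps are omitted here to keep the core idea visible.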
      The Biomek i7 Hybrid Automated Workstation is an automated liquid handler that performs complex liquid handling steps such as those required in the 10X Genomics single-cell transcriptome library workflow. Our Biomek instrument was equipped with an 8-channel Span-8 pod and a 96-well Multichannel head that could accurately pipette from 1 to 1000 µL and 5 to 1200 µL, respectively. It had 45 deck positions which included an orbital shaker, heating/cooling Peltiers, a tip-washing station, a thermal cycler and two grippers for the movement of plates on the deck. Here, the method we developed for the processing of the 10X Chromium 3’ gene expression libraries was a fully walk-away procedure that needed no user intervention, thereby completely freeing the laboratory operator. Either by automation or manual handling, 100 ng of cDNA was subjected to a fragmentation step, followed by end repair, A-tailing, and a double-sided, bead-based size selection. This was followed by adapter ligation, a bead cleanup, and the final PCR for sample indexing. The resulting double-stranded DNA product was then cleaned up using double-sided, size selection and an additional 1:1 bead cleanup to remove any residual adapter and primer dimer.
      In order to validate the newly developed Biomek i7 method, we directly compared the scRNA-seq automated libraries to their manually processed, sample-matched counterparts. We found that samples from automated and manual preparations were virtually identical in fragment size (Figure 3A). The average peak for manual libraries was 457 bp, and 466 bp for libraries made using the Biomek method with a 1.7% CV in overall size distribution. Importantly, this analysis also showed that there was no adapter or primer dimer present in the 0-100 bp range (Figure 3C). Aside from quality, another parameter that was assessed was library quantity. The average total yield for the manual libraries was 2.4 pmol, as compared to 3.1 pmol for the automated libraries (Figure 3B). This small difference may be partially explained by the additional PCR cycle performed by the on-deck thermal cycler fitted on the Biomek i7, as compared to the off-deck cycler (13 vs 12 PCR cycles). Indeed, while designing this automated method, we wanted to account for a possible lower performance of the on-deck thermal cycler as compared to a traditional off-deck cycler used during manual library handling. We also wanted to consider a potential loss of material over the whole automation process. In our hands, this can occur in some automated approaches, as during cleanup steps a slightly larger volume must be left behind to allow for a larger pipetting margin of error. Regardless of the library generation method, the final amounts of double-stranded libraries were within the expected concentration range. Importantly, the yields were more than enough to move forward with sequencing.
      Fig. 3. Library QC. Yield (A) and fragment size (B) were assessed for the manually prepared libraries and their matching Biomek-automated counterparts. Fragment Analyzer traces (C) of both sets demonstrate the qualitative similarity in library profile.

      Sequencing performance across automated and manual libraries shows high correlation

      Each manually generated scRNA-seq library was sequenced twice (Manual #1 and Manual #2) (Figure 4A-C). This sequencing strategy was selected because it would allow the determination of the typical run-to-run sequencing variability of the same library. This amount of variability was then used to compare the sequencing results from the manual libraries to those of the Biomek-generated libraries. Therefore, three sequencing rounds were performed on each sample: two back-to-back sequencing runs on the manually handled libraries, and a sequencing run using the Biomek-generated libraries. After sample demultiplexing and processing (see Methods), the primary sequencing results for the two sequencing runs of the same manual libraries were reviewed (Figure 4A-C). These primary sequencing metrics showed high correlation across all six samples: sequencing saturation (R²=0.996), fraction of outlier cells (R²=0.971) and mitochondrial RNA fraction (R²=0.982). This indicated that sequencing the same library twice produced highly reproducible primary results in terms of sequencing depth, detection of cells to be excluded from the downstream analysis, and amount of residual, leaky ambient mitochondrial RNA present in each sample.
      Fig. 4. Single-Cell Transcriptomics Sequencing Metrics. After GEM generation on the 10X Chromium Controller, cDNA was generated for all six samples. Gene expression libraries were then prepared twice using 100 ng of cDNA as input: once manually, and once by automation. The set of manually generated libraries was subsequently sequenced twice back-to-back (A-C) and compared to the Biomek automated method (D-F). Sequencing saturation (A, D) was used to estimate library complexity. The outlier fraction (B, E) represents the low-quality cell fraction to be excluded from downstream analysis. The median mitochondrial RNA per cell (C, F) was used to estimate ambient RNA contamination due to apoptotic or dying cells.
      When comparing the two manual sequencing runs to that of the Biomek libraries, we again observed a high degree of correlation with average R² values of 0.961, 0.964 and 0.965 for the sequencing saturation, fraction outlier and mtRNA content, respectively. Analysis of the individual runs (Figure 4D-F) showed only slight variability in the correlations between Biomek vs Manual #1 and Biomek vs Manual #2. In some cases, the values observed were equal to or even higher than the manual run correlations (Manual #1 vs Manual #2). For example, we calculated a slightly higher correlation between Biomek and Manual #1 (R²=0.973) than between Manual #1 and #2 (R²=0.971) for the outlier fraction parameter (Figure 4 B, E). Similarly, for the mtRNA content, we observed almost the same correlation between Manual #1 vs #2 (R²=0.982), and between Biomek vs Manual #2 (R²=0.980) (Figure 4 C, F).
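The R² values quoted in this section compare one per-sample metric between two runs across the six samples. The sketch below shows one common way to obtain such a value, assuming R² is the square of the Pearson correlation coefficient between the two vectors (equivalent to the coefficient of determination of a simple linear fit). The text does not specify the exact calculation, and the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Squared Pearson correlation, used here as the R² between two runs."""
    return pearson_r(x, y) ** 2
```

In use, `x` would hold a metric such as sequencing saturation for the six samples in one run and `y` the same metric for the matching samples in the other run. With only six points per comparison, small differences between R² values should be interpreted cautiously.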
      Moving further downstream in the analysis pipeline, we documented the number of single cells detected per sample in all three sequencing rounds. When comparing the two manual runs against each other, this correlation was the lowest of all quality assessment metrics (R²=0.911) (Figure 5A). This indicated that this metric was the most likely to vary from one library preparation to another. Similarly, the cell count correlation for Biomek vs Manual averaged R²=0.798 across the two runs, also the lowest of all surveyed metrics (Figure 5E). The assessment of how many cells are present in any given sample (i.e. 10X barcode counting) may benefit from a higher sequencing depth, which would allow for a wider survey of the library molecular diversity. As for the median number of genes detected per cell (Figure 5B, F), we saw highly similar correlations, with an average of R²=0.938 for the Manual vs Biomek comparison and R²=0.961 for Manual #1 vs #2. Here, one of the individual comparisons (Biomek vs Manual #1, R²=0.963) even surpassed the correlation observed when re-sequencing the same manual library (Figure 5F), a testament to how similar these libraries are. Likewise, the total UMI counts per cell (Figure 5C, G) and the fraction of reads with valid UMIs (Figure 5D, H) both showed similarly high correlation values. It is worth noting that the deeper the sequencing, the more molecular diversity and library complexity is captured, and the higher the UMI and gene counts typically are, up to a certain point where these parameters reach a plateau. We noticed that Manual #1 had marginally lower sequencing saturation than Manual #2 and Biomek (Figure 4A, D). As a result of this lower sequencing depth, we observed slightly fewer detected cells and slightly lower gene and UMI counts per cell for Manual #1. Conversely, Biomek-generated libraries tended to be slightly higher for these parameters, while the Manual #2 libraries sat between these two datasets. This is likely a function of slight differences in sequencing depth due to inherent run-to-run sequencing variations coming from library pooling and balancing of each dataset. Of note, an average of 3.45% of cells across all samples were computationally identified as doublets, which is less than the expected 5% doublet rate (Supplementary Table 1). This concluded our comparison of quality assessment metrics, which demonstrates that scRNA-seq sequencing metrics are highly correlated across the automated and the manual libraries.
      Fig. 5. Single-Cell Transcriptomics Molecular Performance. From the original cDNA stock, sequencing libraries were prepared manually and sequenced twice (A-D), before being compared to the Biomek automated method (E-H). Parameters analyzed for each sequencing run included the number of cells per sample (A, E), the median number of genes detected per cell (B, F), and the total UMI counts per cell (C, G). The fraction of valid UMIs (D, H) represents the fraction of reads with UMIs that 1) are not a homopolymer sequence and 2) do not contain any Ns. A low (<70%) valid UMI fraction may be indicative of sequencing or library issues.
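The valid-UMI criterion given in the caption of Fig. 5 can be written as a short check. This is a minimal sketch of the two stated rules (not a homopolymer, no N bases) applied to a list of UMI sequences; the function names are illustrative and this is not the pipeline's own code.

```python
def is_valid_umi(umi):
    """Return True if the UMI is not a homopolymer and contains no N bases."""
    umi = umi.upper()
    if not umi:
        return False
    if "N" in umi:
        return False
    # A homopolymer consists of a single repeated base, e.g. "AAAAAAAAAAAA".
    if len(set(umi)) == 1:
        return False
    return True


def valid_umi_fraction(umis):
    """Fraction of UMIs in `umis` (one per read) that pass `is_valid_umi`."""
    umis = list(umis)
    if not umis:
        raise ValueError("no UMIs supplied")
    return sum(is_valid_umi(u) for u in umis) / len(umis)
```

A sample whose fraction falls below the 70% level mentioned in the caption would be flagged for a closer look at the sequencing run or the library.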

      Single-cell cluster analysis confirms near-perfect match of libraries generated manually or on the Biomek i7 instrument

      Macrophages are central to tissue homeostasis and innate immunity [
      • Pollard J.W.
      Trophic Macrophages in Development and Disease.
      ]. They make up most of the immune cells in the tumor microenvironment and their polarization into two functional phenotypes, broadly categorized as pro- and anti-tumor, helps orchestrate cancer-related inflammation [
      • Sica A.
      Role of Tumour-Associated Macrophages in Cancer-Related Inflammation.
      ,
      • Parisi L.
      • Gini E.
      • Baci D.
      • et al.
      Macrophage Polarization in Chronic Inflammatory Diseases: Killers or Builders?.
      ]. Recently, macrophage polarization was shown to be a dynamic continuum, and intrinsically dependent on the balance between TNF and IL-13 [
      • Kratochvill F.
      • Neale G.
      • Haverkamp J.M.
      • et al.
      TNF Counterbalances the Emergence of M2 Tumor Macrophages.
      ]. Here, macrophages were differentiated from mouse bone marrow in MCSF to generate quiescent, unactivated macrophages, which were then exposed to diametrically opposing cytokine cocktails: either IL4 and IL13 to induce an anti-inflammatory, pro-tumorigenic phenotype, or TNF to induce a pro-inflammatory, anti-tumor phenotype. We hypothesized that these cytokine stimulations would polarize the macrophages toward two distinct transcriptional responses which could be subsequently identified by scRNA-seq. We further investigated the gene expression profiles of all 74,628 macrophages on a per-cell basis. Cluster analysis revealed 2 main clusters that were divided based on the above-mentioned treatments of the macrophages after differentiation (Figure 6). These two clearly separated clusters obtained in the UMAP reflect the in vitro generated macrophage polarization and hence can help resolve clear-cut macrophage transcriptional phenotypes. The data therefore show that it is possible to segregate macrophage polarization into distinct and non-overlapping transcriptional phenotypes, which now opens up the possibility of identifying unique markers for each population. A deeper understanding of specific macrophage populations in vitro and in vivo is of particular importance to, and a prerequisite for, the development of drugs.
      Fig. 6. UMAP Representation of Manual and Automated Datasets. The UMAP dimensionality reduction method was used to visualize the data, highlighting sample-specific cells in red, while the remainder of the cohort is represented in grey (n=6). The Pearson correlation values are reported.
      When comparing manual and automated libraries, no visual distinction could be found between sample-matched libraries, with individual cells and sub-clusters virtually indistinguishable from one another (Figure 6). We then compared the gene expression profiles of scRNA-seq libraries generated via manual handling (Manual #1 vs #2) and found a very high correlation of R=0.973 (Pearson correlation) across the individual cell transcriptomes. This correlation reflects the technical variance inherent to any new sequencing run. Notably, when comparing transcript levels across the manual and automated libraries, we found a correlation of R=0.971, almost identical to the above-mentioned technical variance. With such high degrees of similarity, we could therefore extrapolate that any minor differences observed in the transcriptome of individual cells can be attributed to the run-to-run sequencing reproducibility error, rather than the handling of the libraries themselves. In other words, generating a scRNA-seq library using the Biomek or manually is equivalent to re-sequencing that very same library.

      Discussion

      Technological advances within the last two decades have led to the ability for researchers to examine transcriptomics on the single-cell level [
      • Chiang M.K.
      • Melton D.A.
      Single-Cell Transcript Analysis of Pancreas Development.
      ]. As the number of cells that can be sequenced in a single experiment increases, the sequencing cost per cell decreases, and this has led to several new, commercially available single-cell assay kits that are high-throughput compatible. Each of these has its own strengths and weaknesses, so the workflow selection should be based primarily on the biological system being studied. The first step of these workflows generally involves the isolation of single cells, which can be accomplished using microfluidic droplet-based systems, such as the Chromium (10X Genomics), ddSEQ (BioRad), and Drop-Seq, or using a nanowell microchip system, such as the ICELL8 (Takara Bio). Recent work by Yamawaki and coworkers compared the ability of these systems to generate scRNA-seq libraries from a heterogeneous mixture of immune cells. This work found that the Chromium workflow captured significantly more cells than the other cell isolation methods tested, which is an important factor when working with precious samples [
      • Yamawaki T.M.
      • Lu D.R.
      • Ellwanger D.C.
      • et al.
      Systematic Comparison of High-Throughput Single-Cell RNA-seq Methods for Immune Cell Profiling.
      ]. Furthermore, compared to the other systems mentioned above, the 10X workflow was the most sensitive for mRNA detection, as it produced the most complex libraries with the highest gene counts observed in the study. A similar study compared thirteen different scRNA-seq workflows and benchmarked them according to factors such as gene detection, marker expression, and the ability to identify and cluster different cell types. Again, the Chromium system performed well, only behind Quartz-seq2 [
      • Mereu E.
      • Lafzi A.
      • Moutinho C.
      • et al.
      Benchmarking Single-Cell RNA-sequencing Protocols for Cell Atlas Projects.
      ]. Notably, Quartz-seq2 is a plate-based system that uses a cell sorter to isolate single cells [
      • Sasagawa Y.
      • Danno H.
      • Takada H.
      • et al.
      Quartz-Seq2: A High-Throughput Single-Cell RNA-sequencing Method that Effectively uses Limited Sequence Reads.
      ]. The microfluidic Chromium system can partition and capture more cells per experiment in a faster manner, so when an experiment requires analysis of a large number of cells, the Chromium system may be preferred.
      Single-cell RNA-seq has continuously improved in terms of cost, quality, speed and yield since its inception in 2010, when libraries were generated from 16 cells within six days [
      • Tang F.
      • Barbacioru C.
      • Nordman E.
      • et al.
      RNA-Seq Analysis to Capture the Transcriptome Landscape of a Single Cell.
      ]. A decade later, genomics laboratories are routinely processing 5,000 cells per sample and generating libraries within a couple of days, including sequencing. In terms of manual handling and time, the bottleneck for scRNA-seq protocols remains the generation of the single-cell suspension. Tissue dissociation itself may require significant optimization depending on the tissue source, with various dissociation methods available (e.g., enzymatic or mechanical). Even for tissues originating from a single source, we have seen the necessity for optimizing dissociation based on the disease status (e.g., healthy vs. fibrotic lung). At this stage and up until single-cell droplet encapsulation, also called GEM generation in the 10X Genomics workflow, we often see the need for handling samples on an individual basis. This work often requires both laboratory operator intervention and user experience in terms of sample inclusion or exclusion based on quality control factors important to scRNA-seq quality, such as cell viability and aggregation rate. Because this process has so many variables, we suggest there is very little to gain from automating this upstream part of the scRNA-seq workflow. However, when considering the post-GEM generation portion of the workflow, automation becomes an attractive option once cDNA has been generated from individual samples, normalized, and stored. At this stage, there is no longer any user intervention needed and processing is identical across all samples, regardless of cell type, tissue source, or disease status. Once cDNA is produced, one could even combine different projects together and generate libraries in a single batch. With this in mind, we focused our efforts on developing and validating a fully walk-away method capable of generating up to 48 scRNA-seq libraries in 3 hours, with end-to-end robotic handling, requiring only 45 minutes of setup time from the laboratory operator.
      We first demonstrated that repeated sequencing of the same scRNA-seq library shows a small but detectable run-to-run variability, corresponding to a correlation of R=0.973 in our experimental set-up. This was determined by sequencing the same library twice and comparing the transcriptomes of each individual cell. Depending on the cell type or cellular state, there are between 10^5 and 10^6 mRNA molecules present in any given cell, with more than 10,000 expressed genes [
      • Hwang B.
      • Lee J.H.
      • Bang D.
      Single-cell RNA Sequencing Technologies and Bioinformatics Pipelines.
      ]. The molecular diversity of a scRNA-seq library is therefore very high, due not only to transcript diversity but also to the Unique Molecular Identifiers (UMIs) incorporated into the cDNA, the sample indexes needed for sample demultiplexing, and the cell barcodes required for cell demultiplexing. One way to mitigate such run-to-run sequencing variability is to increase sequencing depth to reach saturation. The depth at which saturation is reached depends on the mRNA molecular diversity of the samples.
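      As an illustration of the saturation concept above, the following minimal Python sketch computes sequencing saturation as the fraction of reads that duplicate an already observed (cell barcode, UMI, gene) combination. This follows a commonly used definition and is not necessarily the metric used in this study; the function name and the toy reads are hypothetical.

```python
from typing import Iterable, Tuple

# One read is summarized here as (cell barcode, UMI, gene); toy representation.
Read = Tuple[str, str, str]


def sequencing_saturation(reads: Iterable[Read]) -> float:
    """Return 1 - (unique barcode/UMI/gene combinations / total reads)."""
    reads = list(reads)
    if not reads:
        raise ValueError("at least one read is required")
    unique_combinations = len(set(reads))
    return 1.0 - unique_combinations / len(reads)


# Hypothetical example: four reads, one of which duplicates another.
toy_reads = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI2", "GeneA"),
    ("TTTG", "UMI1", "GeneB"),
]
print(f"saturation = {sequencing_saturation(toy_reads):.2f}")
```

      A value approaching 1 suggests that additional depth would mostly re-sequence molecules that were already observed, whereas a low value suggests that the library still contains many unsampled molecules.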
      Knowing that the variability inherent in the sequencing workflow corresponded to R=0.973, we then looked at the correlation between manual and automated processing. The correlation obtained with the automated method (R=0.971) was virtually identical to that attributable to sequencing variability alone. If automated processing had introduced significant library differences in addition to the sequencing variability, we would have expected a correlation much lower than that of the sequencing variability alone. Taken together, these results show that libraries generated with the automated method on the Biomek are indistinguishable from manually prepared libraries.
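      The reasoning above, comparing a method-to-method correlation against the correlation obtained from sequencing alone, can be sketched as follows. The source does not detail how R was calculated at this point, so this sketch assumes a Pearson correlation on log-normalized per-gene counts; the helper names and the tolerance value are hypothetical choices made for illustration only.

```python
import math
from typing import Sequence


def log_normalize(counts: Sequence[float], scale: float = 1e4) -> list:
    """Scale counts to a common library size, then apply log(1 + x)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [math.log1p(c / total * scale) for c in counts]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def within_noise_floor(r_method: float, r_sequencing: float,
                       tolerance: float = 0.01) -> bool:
    """True if r_method is no more than `tolerance` below r_sequencing.

    `tolerance` is an arbitrary illustrative threshold, not from the study.
    """
    return r_sequencing - r_method <= tolerance


# Values reported in the text: sequencing-only vs manual-vs-automated.
print(within_noise_floor(r_method=0.971, r_sequencing=0.973))
```

      With per-gene count vectors from two libraries, `pearson_r(log_normalize(a), log_normalize(b))` would give the method-to-method correlation to compare against the sequencing-only value.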
      As sequencing costs have significantly decreased, scRNA-seq experiments have scaled up exponentially and become routine [
      • Svensson V.
      • Vento-Tormo R.
      • Teichmann S.A.
      Exponential Scaling of Single-Cell RNA-seq in the Past Decade.
      ]. This scaling up may be further aided by liquid-handling robotics such as the Biomek i7 Workstation platform. Additionally, tagging and multiplexing of samples prior to droplet encapsulation, also known as cell hashing, can significantly reduce reagent volumes and consumable costs. This can be done via either antibody- or lipid-based multiplexing methods for both single nuclei and single cells [
      • Mylka V.
      • Aerts J.
      • Matetovici I.
      • et al.
      Comparative Analysis of Antibody- and Lipid-Based Multiplexing Methods for Single-Cell RNA-seq.
      ,
      • Stoeckius M.
      • Hafemeister C.
      • Stephenson W.
      • et al.
      Simultaneous Epitope and Transcriptome Measurement in Single Cells.
      ]. The new 10X Genomics lipid-based multiplexing method for scRNA-seq promises to pool up to 12 samples in a species-agnostic manner (https://support.10xgenomics.com/permalink/3RWaZm0kJBmtSjB2UGRieV). Notably, the library processing automated in the present study is compatible with the newly released Cell Multiplexing 10X kit. Ultimately, the use of walk-away automated methods can help reduce users’ hands-on time so that they can attend to other important laboratory tasks.
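      To make the potential saving from sample multiplexing concrete, the short Python sketch below counts how many GEM reactions (and therefore downstream libraries) would be needed with and without pooling. It is a back-of-the-envelope illustration only: the function name is hypothetical, and it assumes that every pool can be filled up to the stated maximum, which ignores per-sample cell recovery requirements.

```python
import math


def gem_reactions_needed(n_samples: int, samples_per_pool: int = 1) -> int:
    """Number of GEM reactions required when pooling samples before encapsulation."""
    if n_samples < 0 or samples_per_pool < 1:
        raise ValueError("n_samples must be >= 0 and samples_per_pool >= 1")
    return math.ceil(n_samples / samples_per_pool)


# 48 samples processed individually vs pooled 12 per reaction
# (12 is the pooling limit quoted in the text above).
unpooled = gem_reactions_needed(48)
pooled = gem_reactions_needed(48, samples_per_pool=12)
print(f"unpooled: {unpooled} reactions, pooled: {pooled} reactions")
```

      In practice the achievable pool size also depends on how many cells per sample are required, because pooled samples share the cell capacity of a single reaction.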
      It should be noted that the present study establishes high correlation between the scRNA-seq manual and automated library processes for macrophages undergoing cytokine-induced polarization. Other cell types, such as small airway epithelial cells (SAECs), were also tested for reproducibility and an even higher correlation was found (R=0.992, n=8 – data not shown). However, further investigation would be needed to establish the exact repeatability of the automated method itself and whether it is higher than that of the manual handling. Furthermore, the 10X single-cell method was written on the Biomek i-Series Software version BM5.1.10 and designed for a Biomek i7 Hybrid instrument. Based on hardware requirements of the method, other Biomek instruments with the deck space to accommodate the automated thermal cycler, an orbital shaker, two Peltiers, and a total of 20 individual deck positions (Supplementary Figure 2), would be capable of performing workflows similar to the automated method described here. Therefore, the method can be adapted to other Biomek i-Series models, such as the i7 Dual Multichannel or the i7 Dual Span-8, as well as smaller instruments, such as the i5 Multichannel or i5 Span-8. In theory, an analogous method could also be adapted on the Biomek 4 software for older Biomek systems (FXP) equipped with the required devices and integrations.
      In summary, we have shown that automation yields single-cell gene expression libraries of the same high quality as manual processing, while significantly reducing hands-on time. Further large-scale studies are needed to confirm reproducibility and stability across multiple automated scRNA-seq library generation runs. We expect that in the near future, many labs will have the tools and ability to integrate similar systems and generate scRNA-seq data at scale.

      Disclaimer

      Biomek Automated Workstations are not intended or validated for use in the diagnosis of disease or other conditions. This protocol is for demonstration only and is not validated by Beckman Coulter. Beckman Coulter makes no warranties of any kind whatsoever express or implied, with respect to this protocol, including but not limited to warranties of fitness for a particular purpose or merchantability or that the protocol is non-infringing. All warranties are expressly disclaimed. Your use of the method is solely at your own risk, without recourse to Beckman Coulter.
      ©2021 Beckman Coulter, Inc. and Boehringer Ingelheim. All rights reserved. Beckman Coulter, the Stylized Logo, and Beckman Coulter product and service marks mentioned herein, including Biomek, are trademarks or registered trademarks of Beckman Coulter, Inc. in the United States and other countries. Chromium is a trademark of 10X Genomics.  All other trademarks are the property of their respective owners.

      Appendix. Supplementary materials

      References

        • Chen G.
        • Ning B.
        • Shi T.
        Single-Cell RNA-Seq Technologies and Related Computational Data Analysis.
        Front Genet. 2019; 10: 317
        • See P.
        • Lum J.
        • Chen J.
        • et al.
        A Single-Cell Sequencing Guide for Immunologists.
        Front Immunol. 2018; 9: 2425
      1. 10X Genomics. User Guide: Chromium Next GEM Single Cell 3’ Reagent Kits v3.1. 2019 https://support.10xgenomics.com/single-cell-gene-expression/library-prep/doc/user-guide-chromium-single-cell-3-reagent-kits-user-guide-v31-chemistry.

        • Mamanova L.
        • Miao Z.
        • Jinat A.
        • et al.
        High-Throughput Full-Length Single-Cell RNA-seq Automation.
        Nat Protoc. 2021; 16: 2886-2915
        • Koss C.K.
        • Wohnhaas C.T.
        • Baker J.R.
        • et al.
        IL36 is a Critical Upstream Amplifier of Neutrophilic Lung Inflammation in Mice.
        Commun Biol. 2021; 4: 172
        • Dobin A.
        • Davis C.A.
        • Schlesinger F.
        • et al.
        STAR: Ultrafast Universal RNA-seq Aligner.
        Bioinformatics. 2013; 29: 15-21
        • Wolock S.L.
        • Lopez R.
        • Klein A.M.
        Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomics Data.
        Cell Syst. 2019; 8: 281-291
        • Stuart T.
        • Butler A.
        • Hoffman P.
        • et al.
        Comprehensive Integration of Single-Cell Data.
        Cell. 2019; 177: 1888-1902
        • Wolf F.A.
        • Angerer P.
        • Theis F.J.
        SCANPY: Large-Scale Single-Cell Gene Expression Data Analysis.
        Genome Biol. 2018; 19: 15
        • Pollard J.W.
        Trophic Macrophages in Development and Disease.
        Nat Rev Immunol. 2009; 9: 259-270
        • Sica A.
        Role of Tumour-Associated Macrophages in Cancer-Related Inflammation.
        Exp Oncol. 2010; 32: 153-158
        • Parisi L.
        • Gini E.
        • Baci D.
        • et al.
        Macrophage Polarization in Chronic Inflammatory Diseases: Killers or Builders?.
        J Immunol Res. 2018; 8917804
        • Kratochvill F.
        • Neale G.
        • Haverkamp J.M.
        • et al.
        TNF Counterbalances the Emergence of M2 Tumor Macrophages.
        Cell Rep. 2015; 12: 1902-1914
        • Chiang M.K.
        • Melton D.A.
        Single-Cell Transcript Analysis of Pancreas Development.
        Dev Cell. 2003; 4: 383-393
        • Yamawaki T.M.
        • Lu D.R.
        • Ellwanger D.C.
        • et al.
        Systematic Comparison of High-Throughput Single-Cell RNA-seq Methods for Immune Cell Profiling.
        BMC Genomics. 2021; 22: 66
        • Mereu E.
        • Lafzi A.
        • Moutinho C.
        • et al.
        Benchmarking Single-Cell RNA-sequencing Protocols for Cell Atlas Projects.
        Nat Biotechnol. 2020; 38: 747-755
        • Sasagawa Y.
        • Danno H.
        • Takada H.
        • et al.
        Quartz-Seq2: A High-Throughput Single-Cell RNA-sequencing Method that Effectively uses Limited Sequence Reads.
        Genome Biol. 2018; 19: 29
        • Tang F.
        • Barbacioru C.
        • Nordman E.
        • et al.
        RNA-Seq Analysis to Capture the Transcriptome Landscape of a Single Cell.
        Nat Protoc. 2010; 5: 516-535
        • Hwang B.
        • Lee J.H.
        • Bang D.
        Single-cell RNA Sequencing Technologies and Bioinformatics Pipelines.
        Exp Mol Med. 2018; 50: 1-14
        • Svensson V.
        • Vento-Tormo R.
        • Teichmann S.A.
        Exponential Scaling of Single-Cell RNA-seq in the Past Decade.
        Nat Protoc. 2018; 13: 599-604
        • Mylka V.
        • Aerts J.
        • Matetovici I.
        • et al.
        Comparative Analysis of Antibody- and Lipid-Based Multiplexing Methods for Single-Cell RNA-seq.
        bioRxiv. 2017;
        • Stoeckius M.
        • Hafemeister C.
        • Stephenson W.
        • et al.
        Simultaneous Epitope and Transcriptome Measurement in Single Cells.
        Nat. Methods. 2017; 14: 865-868