High-throughput automated microfluidic sample


ARTICLE

Received 14 Oct 2016 | Accepted 11 Nov 2016 | Published 27 Jan 2017

Soohong Kim1,2, Joachim De Jonghe1,3, Anthony B. Kulesa1,2, David Feldman1,4, Tommi Vatanen1,5, Roby P. Bhattacharyya1,6, Brittany Berdy7, James Gomez1, Jill Nolan1, Slava Epstein7 & Paul C. Blainey1,2

Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells-to-sequence-library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (~10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ~400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications.

1 The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA. 2 Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA. 3 Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK. 4 Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA. 5 Department of Computer Science, Aalto University School of Science, Espoo 02150, Finland. 6 Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 7 Department of Biology, Northeastern University, Boston, Massachusetts 02115, USA. Correspondence and requests for materials should be addressed to P.C.B. (email: [email protected]).

NATURE COMMUNICATIONS | 8:13919 | DOI: 10.1038/ncomms13919 | http://www.nature.com/naturecommunications

DOI: 10.1038/ncomms13919 OPEN

High-throughput automated microfluidic sample preparation for accurate microbial genomics


Low-cost DNA sequence data generation is enabling the widespread application of genomic methods across the microbial sciences. Genome sequencing can comprehensively survey commensal microbiota1, enable the diagnosis of drug resistant infections2–6 and reveal networks through which infections are transmitted7. In particular, pathogen surveillance by whole-genome shotgun (WGS) analysis provides information for molecular epidemiology of critical value to public health7,8 that cannot be obtained by culture or PCR. To this point, a recent Executive Order9 called for nationwide tracking of antibiotic resistance in microbial pathogens by genome sequencing in the US. In addition, natural products produced by microbes continue to serve as a rich source of therapeutic compounds spanning antibiotics to cancer10. Such compounds can be discovered by performing large-scale sequencing of environmental samples11.

Despite impressive progress in technology for sequence data production, the methods used to prepare sequencing samples lag behind (Supplementary Fig. 1). To sequence bacterial genomes, cells must be lysed and their DNA purified, fragmented, tagged with adaptors and size-selected before loading on a sequencing instrument. The complex experimental logistics and labour currently required to complete these steps limit sample throughput. The introduction of liquid handling robotics and electrowetting-based digital microfluidics has helped to increase throughput, but these workflows require high DNA input, do not integrate all the key workflow steps (variously omitting cell lysis, DNA fragmentation and size selection), and substantially offset reductions in reagent and labour costs with expensive proprietary equipment and consumables (Supplementary Fig. 2)12.

The performance of available sample preparation methods on low-quantity samples is limiting in many microbial applications, as microbes can be difficult to isolate and grow to the quantities required for sequencing. Data from samples that are expanded by extensive culture or biochemical amplification can be significantly biased and are more likely to be contaminated13–15. Available library construction techniques typically require inputs of >1 million cell equivalents and are susceptible to contamination (particularly at low-input levels)16–18.

Whole-genome sequencing holds promise as a low-cost, rapid, essentially universal diagnostic for infectious disease19–21. However, input quantity requirements present a serious barrier for rapid WGS-based diagnostics for slow-growing microbes like Mycobacterium tuberculosis. Applications in environmental microbiology and natural product discovery are even further restricted by high-input requirements since only about 1% of environmental microbial isolates are easily cultured to produce a large quantity of pure genomic material22. Innovations in environmental sampling and culturing such as the iChip22 enhance the chance of producing novel isolates, but also produce micro-colonies that are recalcitrant to scale-up culture and provide too little biomass for available direct WGS approaches.

Here we introduce a new polydimethylsiloxane (PDMS) microfluidic circuit architecture that integrates all the major steps in sample preparation for the first time (Supplementary Fig. 2; Supplementary Data 1) and makes major advances in input requirement and throughput while maintaining data quality. We demonstrated the new microarchitecture for sample preparation from clinical pathogen and environmental isolates, showing excellent data quality at extraordinarily low-input quantities. We expect this sample preparation platform will find wide application in a variety of high sample throughput genomics applications2–5,23–26.

Results

Microdevice design and operation. Microfluidics is a natural solution to increase throughput and reduce input requirements thanks to its scalable automation and capability to precisely manipulate small volumes27,28. Even small bench-top sequencing instruments like the MiSeq can sequence many microbial genomes per run, necessitating larger sample preparation batch sizes. Thus, we optimized a high-density two-layer microfluidic27 sample-processing microarchitecture that enables a batch size up to 96 samples per device run (Fig. 1 and Supplementary Fig. 3). Integrating all the key WGS sample preparation steps in our automated microfluidic device resulted in increased throughput, reduced reagent consumption, reduced input requirements, reduced contamination and improved reproducibility.

To prepare samples for genome sequencing, DNA must first be extracted from cells and then processed into properly sized fragments with attached sequencing adaptors. We extract genomic DNA (gDNA) in the device by lysing cells with a combination of heat, detergents, and hydrolytic enzymes. To fragment and attach adaptors to the gDNA, we apply the tagmentation chemistry23–26,29, which uses transposase enzymes to insert adaptor oligonucleotide sequences directly into high molecular weight DNA. For DNA purification, which is required at multiple points in the sample preparation process, we implement the solid phase reversible immobilization (SPRI)30 method in the microfluidic system (Supplementary Note 1).

As we increased the reactor density to fit 96 channels in a 70 mm × 35 mm device, we adapted our design to prevent cross-contamination among samples and reagents, while still allowing for a single shared waste stream and individual collection of products.

Our microfluidic circuit performs the DNA extraction and library construction steps in a rotary reactor31 of 36 nanoliters (nl) that meters and mixes reagents and operates in concert with filter valves to strain cells or beads from solution (Fig. 1a and Supplementary Fig. 4). Standard micro-valves partition the rotary reactor for dead-end-fill loading27 of precise quantities of each reagent. These same valves are used coordinately as peristaltic pumps to mix reaction components within the rotary reactors (Supplementary Fig. 3a). Filter valves are improved sieve valves27 that can be actuated to rapidly strain particles from solution (Supplementary Fig. 5). To concentrate cells or purify, concentrate, and size-select nucleic acid products, we capture beads using filter valves at the output of each reactor (Supplementary Fig. 3a). This purification approach requires no chemical modification of the device32 and simplifies the reaction circuit since the filter valves can be directed to collect or release beads by flow in either direction, eliminating the need for additional bypass channels that reduce reactor density33. The micro-automated SPRI purifications used in DNA extraction, reaction clean-up and size selection operations yield >80% recovery of picogram-level starting material (Supplementary Fig. 3b).

To establish multiple manifolds for solution loading/unloading in the two-layer high-density format, we strategically placed waste port vias within each filter unit that connect to a sewer line in the bottom layer of the device33 (Supplementary Fig. 3a). This arrangement enables bead, sample and reagent wastes to escape the high-density reaction circuits in the device upper layer without cross contamination (Supplementary Fig. 6; Supplementary Note 2) and without the complexity of extra device layers or extra space needed for internal access ports34. We initially load reagents into each reactor through one manifold (red arrow, Fig. 1a), while cells or gDNA samples are loaded via individual input/output ports (black arrow, Fig. 1a).

Finally, the re-use of each rotary reactor for multiple process steps eliminates the need for a series of reactors matched to each workflow step. Re-use is enabled by the capability for reaction/pull-down (for example, filtration or purification) steps.


[Figure 1 panel residue. Recoverable labels: 96 reactors; on-device integrated microfluidic workflows (cell input workflow, used to generate direct-from-cells libraries; gDNA input workflow, used to generate libraries from gDNA). Plots of library complexity (×) and map rate (%) by sample ID for gDNA input (low-input from 50 pg, 10⁴ cell eq., clinical P. aeruginosa DNA), cell input (ultra low-input from 1,000 E. coli cells), cell input (low-input ~10⁴ M. tuberculosis cells, with NTC and wash controls) and cell input (low-input iChip ~10⁴ soil micro-colony cells). Device versus bench-top comparison of library complexity (13/14 device libraries show >200× complexity) and of human DNA fraction (low contamination in device libraries).]

Figure 1 | Genomic sample preparation device operation and performance. (a) A photograph of the 96 × 36 nl microfluidic sample preparation device filled with food colouring to highlight features. Dime indicates scale (white bar indicates 1 cm). Inset: the reactor (red), filter (yellow) and reservoir (green) units. Black and red arrows designate reagent input ports and sample input/output ports, respectively. (b) The microfluidic sample preparation workflows for biomass input (extreme left) and gDNA input (top). (c–e) Estimated sequence library complexity (units of fold-coverage) and mapping rate for (c) the clinical P. aeruginosa isolate gDNA samples mapped to PA-14 (of the 384 samples, eight individual replicates were lost completely during the barcoding step, leaving two replicates for each of these isolates. We know that these PCR drop-outs occurred due to faulty primers because these particular eight primer sets were subsequently observed to fail consistently across multiple samples). (d) Low-input E. coli biomass samples mapped to BL21-DE3 and (e) low-input M. tuberculosis biomass samples mapped to OFXR-14. (f) Comparison of microfluidic and optimized bench-top sample preparation from low-input soil micro-colony biomass (left, library complexity; right, human contamination). The library complexities were calculated using Picard tools (http://broadinstitute.github.io/picard/) and the human DNA read fraction was determined using deconseq (http://deconseq.sourceforge.net/).

The combination of reactor re-use for an arbitrary number of process steps together with pull-down/purification capability makes the system largely protocol-independent since most next-generation sequencing (NGS) and cell-handling processes can be accomplished with some number of these elementary steps in combination.

Low-input DNA extraction and library construction. The extraction and library construction workflow uses each reactor for five different reaction steps (lysis, 2× DNA precipitation, tagmentation and reaction stop) and the filter unit for three different capture steps (cell capture and 2× SPRI bead capture). Low sample input capability requires high efficiency lysis, purification and tagmentation, as well as contamination resistance to prevent small samples from being overwhelmed by extraneous DNA. We validated our microfluidic platform by processing four different low-input samples: samples of ~1,000 E. coli cells, ~10,000 M. tuberculosis cells, ~10,000 cell soil micro-colony samples, and GC-rich genomic DNA samples from clinical P. aeruginosa isolates (Supplementary Data 2).
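As an illustration of how one reactor and one filter unit can be re-used across the workflow, the steps named above can be written down as an ordered list of elementary operations. This is only a sketch: the step names come from the text, but the ordering, the `Step` type and the `steps_per_unit` helper are assumptions introduced here, not a description of the device control software.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    unit: str  # "reactor" for a reaction step, "filter" for a capture step


# Assumed ordering of the cell-input workflow; only the step names and
# their counts are taken from the text.
CELL_INPUT_WORKFLOW = [
    Step("cell capture", "filter"),
    Step("lysis", "reactor"),
    Step("DNA precipitation", "reactor"),
    Step("SPRI bead capture", "filter"),
    Step("tagmentation", "reactor"),
    Step("reaction stop", "reactor"),
    Step("DNA precipitation", "reactor"),
    Step("SPRI bead capture", "filter"),
]


def steps_per_unit(workflow):
    """Count how many times each physical unit is re-used in a workflow."""
    return Counter(step.unit for step in workflow)


usage = steps_per_unit(CELL_INPUT_WORKFLOW)
print(dict(usage))
```

Under this encoding a different protocol is simply a different list, which is the sense in which a reaction/pull-down architecture can be largely protocol-independent.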

The microfluidic devices consistently converted 5–15% of input gDNA into library molecules as estimated by qPCR measurement of output tagmentation products (Supplementary Fig. 3c). This value is significantly higher than previously reported low-input library construction efficiencies of 0.5–2% (ref. 35). Efficiency estimates derived from the duplicate sequence read rate agreed with our qPCR analysis36 (Supplementary Note 3). Libraries constructed on-chip from 50–100 picograms gDNA input showed reproducible mapping rates, fragment sizes and insert sizes, and k-mer frequencies consistent with standard high-input sample preparation (Fig. 1c, Supplementary Figs 3d and 7)37.

On-device cell lysis and DNA extraction steps were also found to be highly efficient. We measured 10% end-to-end conversion efficiency and found good library quality when using an ultra-low input of just 1,000 E. coli cells, representing sample and reagent inputs 200 times lower than standard low input protocols that specify 1 ng input (Fig. 1d; Supplementary Note 3). High efficiency was also achieved for organisms like M. tuberculosis (10,000 cell input) and environmental micro-colonies that are known to be more challenging to lyse (Fig. 1e,f; Supplementary Note 4).
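The 200-fold figure can be checked with a back-of-the-envelope conversion from cell number to DNA mass. The sketch below assumes one genome copy per cell, an approximate E. coli genome size of 4.6 Mb and an average mass of 650 g/mol per base pair; none of these values is stated in the text, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23           # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average mass of one double-stranded base pair
ECOLI_GENOME_BP = 4.6e6       # assumed approximate E. coli genome size


def genome_mass_fg(genome_bp: float) -> float:
    """Mass of a single genome copy, in femtograms."""
    return genome_bp * BP_MASS_G_PER_MOL / AVOGADRO * 1e15


def input_mass_pg(cells: float, genome_bp: float) -> float:
    """Total gDNA mass, in picograms, assuming one genome copy per cell."""
    return cells * genome_mass_fg(genome_bp) / 1e3


mass_pg = input_mass_pg(1_000, ECOLI_GENOME_BP)
fold_reduction = 1_000.0 / mass_pg  # relative to a 1 ng (1,000 pg) protocol

print(f"1,000 E. coli cells ~ {mass_pg:.1f} pg gDNA")
print(f"fold reduction versus 1 ng input ~ {fold_reduction:.0f}")
```

Under these assumptions 1,000 cells carry roughly 5 pg of gDNA, which is about 200 times less than 1 ng.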

To test how contamination and sequence library quality compare at low input to conventional sample preparation, we carried out a matched comparison of environmental micro-colony sequence libraries prepared by the microfluidic method with those prepared on the bench-top by an optimized low-input procedure using the same reagents in the same laboratory environment by the same operator (Supplementary Note 5).

For this comparison, we used soil samples from a private Boston garden grown using the iChip22 system, which cultures environmental microbes in situ to produce micro-colonies (Supplementary Fig. 8). We estimated the iChip micro-colonies to contain on the order of 100,000 cells each in total, and the input for sample preparation to be on the order of 10,000 cells (Supplementary Note 4). Each sample was split in half. One part was directly loaded and processed in our device (Fig. 1b and Supplementary Fig. 9), while the remaining portion was processed using an equivalent sample-processing technique at the bench-top. Comparing data from the two sample preparation methods produced from the same micro-colonies in the same HiSeq 2500 sequencing run, the on-device sample preparation yielded an average of twice the library complexity, half the coefficient of variation in complexity, no drop-outs and far lower human DNA contamination (Fig. 1f).

High-throughput library construction from clinical isolates. To test the potential for our approach to address the sample-processing bottleneck that limits application of microbial WGS analysis, we applied our system to process libraries from 124 P. aeruginosa clinical isolates obtained from six randomly selected subjects from Brigham and Women's Hospital (Boston, MA). A total of 12 to 24 P. aeruginosa colonies were isolated from each subject sample (Fig. 2a; Supplementary Data 3). To empirically determine the single-nucleotide polymorphism (SNP) false detection rate, colonies from two of the six subjects (P03 and P04) were each sampled from control plates representing expansions of individual primary colonies. We extracted genomic DNA from each culture, normalized the DNA concentrations and loaded the samples into our microfluidic device for library construction. To enable analysis of any errors that occurred during sample preparation, sequencing or analysis38, we prepared triplicate technical replicate libraries from each DNA sample (Supplementary Note 6). A single pool of 384 barcoded P. aeruginosa and control libraries was sequenced across two HiSeq 2500 lanes (Supplementary Notes 7 and 8).

Low-input libraries support high-specificity SNP calling. In order to determine the genomic diversity across isolates and reliably detect sequence variants with important functional consequences like antibiotic resistance, sequencing libraries must not introduce systematic errors and must maintain uniform genomic coverage. To compare our microfluidic Pseudomonas libraries to those produced using bench-top methods, we instantiated an informatics pipeline for SNP calling (Supplementary Fig. 10) with an allelic fraction (AF) threshold as the primary adjustable parameter controlling sensitivity and specificity (Supplementary Fig. 11). We define the AF as the fraction of quality-adjusted39 variant base counts at a given reference position. A lower AF threshold increases SNP calling sensitivity but increases false-positive calls arising from errors that might occur in library construction, sequencing and read mapping operations. We found that setting the AF threshold to 0.82 limited the number of discordant SNPs detected across replicate libraries to zero (using discordant SNPs as a heuristic for erroneous variant calls38; Fig. 2b,c). To benchmark our parameterized pipeline, we analysed data from a previous study of SNPs in clinical Burkholderia dolosa isolates40 and successfully detected all the reported SNPs (Supplementary Data 4).
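The trade-off controlled by the AF threshold can be made concrete with a small sketch. It is a simplification under stated assumptions: it uses raw rather than quality-adjusted base counts, the per-site input format and the function names are hypothetical, and the toy counts are invented for illustration only. The 0.82 threshold and the minimum read depth of 6 are the values quoted in the text and in the Figure 2 legend.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[int, str]                 # (reference position, variant base)
Pileup = Dict[int, Tuple[str, int, int]]  # position -> (variant base, variant count, depth)


def call_snps(pileup: Pileup, af_threshold: float = 0.82, min_depth: int = 6) -> Set[Site]:
    """Call a SNP where the variant allelic fraction meets the threshold.

    Assumption: raw counts stand in for the quality-adjusted counts
    used in the actual pipeline.
    """
    calls = set()
    for pos, (base, variant_count, depth) in pileup.items():
        if depth >= min_depth and variant_count / depth >= af_threshold:
            calls.add((pos, base))
    return calls


def split_concordant(replicate_calls: List[Set[Site]]) -> Tuple[Set[Site], Set[Site]]:
    """Concordant calls appear in every replicate; the rest are discordant."""
    union = set().union(*replicate_calls)
    concordant = set.intersection(*replicate_calls)
    return concordant, union - concordant


# Invented toy data: position 100 is a true variant, position 200 is noise.
replicates = [
    {100: ("T", 48, 50), 200: ("G", 30, 50)},
    {100: ("T", 45, 50), 200: ("G", 35, 50)},
    {100: ("T", 50, 50), 200: ("G", 15, 50)},
]

strict = split_concordant([call_snps(r, af_threshold=0.82) for r in replicates])
loose = split_concordant([call_snps(r, af_threshold=0.50) for r in replicates])
print("AF >= 0.82:", strict)
print("AF >= 0.50:", loose)
```

In this toy case the looser threshold admits a call at position 200 in only two of the three replicates, so it shows up as discordant, whereas the stricter threshold leaves only the call shared by all replicates.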

To validate the actual sensitivity and specificity of SNP calls in analysis of low-input microfluidic sequence libraries versus bench-top libraries, we prepared several libraries from the same isolate sample: three low-input (50 pg) microfluidic libraries and three high-input (24 ng) bench-top libraries, each sequenced to 50× coverage; and an additional high-input gold standard bench-top library sequenced to 340× coverage. At 50× mean coverage of the 50 pg microfluidic libraries, no false-positive SNPs were detected versus the 340× high-input library, indicating that accuracy was better than 1.4 × 10⁻⁷ when calling more than 99% of reference positions. At equal mean coverage, there was no significant decrease in the quality of SNP calls made from the picogram-input libraries produced in our device versus the nanogram-input bench-top libraries (Fig. 2d). In fact, the 50× low-input consensus base calling accuracy from the 50 pg input samples compares favourably to that reported for advanced single-strand consensus sequencing approaches that depend on extraordinarily deep sequence coverage and extra sample preparation steps (10⁻⁵; ref. 41). Although we expect replicate library sequencing to suppress rare sample preparation artifacts like PCR errors, our comparison revealed no significant difference in the accuracy of replicate library sequencing compared with single-library sequencing at equivalent depth in our samples (Fig. 2d), indicating that artifacts from library construction affected variant calling at a frequency below 10⁻⁷.

Comparative genomics of clinical Pseudomonas isolates. In examining sequence variation between isolates, we identified 25,000–90,000 SNP sites across isolates from different subjects (Fig. 2e), but isolates from the same subject were essentially clonal (Fig. 2f; Supplementary Data 5). We detected no SNP sites among three subjects' isolates (P01, P02 and P06) and one of the control plates (P03). Among isolates from subject P05, just two SNP sites were identified, and only one SNP site was identified among isolates from control plate P04, with 2 of 24 isolates showing evidence for the same variant at this locus (Supplementary Note 9). Because the variant base in these two samples was reproducibly detected across the technical replicates in both affected subject P04 isolates (six sequence libraries in total), we can exclude sample preparation and sequencing errors as explanations for the variant calls. Rather, these calls likely arose from a mixture of strains being transferred to the control plate from overlapping colonies on the primary plate or a mutation that occurred during subsequent laboratory culture.
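The contrast between inter-subject and intra-subject variation can be summarized as pairwise SNP distances grouped by subject, in the manner of a block-wise heat map. The following is a minimal sketch on invented toy genotypes; the data layout (variant calls stored per isolate as position-to-base maps, with absent positions taken as reference) and the function names are assumptions, and no-call masking is ignored.

```python
from itertools import combinations, product
from statistics import median
from typing import Dict, Tuple

Genotype = Dict[int, str]  # position -> variant base; absent means reference


def snp_distance(a: Genotype, b: Genotype) -> int:
    """Number of positions at which two isolates carry different bases."""
    return sum(1 for pos in a.keys() | b.keys() if a.get(pos) != b.get(pos))


def median_block_distance(isolates: Dict[str, Tuple[str, Genotype]],
                          subject_x: str, subject_y: str) -> float:
    """Median pairwise SNP distance for one subject-pair block."""
    names_x = [n for n, (s, _) in isolates.items() if s == subject_x]
    names_y = [n for n, (s, _) in isolates.items() if s == subject_y]
    if subject_x == subject_y:
        pairs = combinations(names_x, 2)   # within-subject pairs
    else:
        pairs = product(names_x, names_y)  # between-subject pairs
    return median(snp_distance(isolates[a][1], isolates[b][1]) for a, b in pairs)


# Invented toy genotypes, not data from the study.
toy = {
    "P01-1": ("P01", {10: "A", 20: "T"}),
    "P01-2": ("P01", {10: "A", 20: "T"}),
    "P05-1": ("P05", {10: "A", 30: "G"}),
    "P05-2": ("P05", {10: "A", 30: "G", 40: "C"}),
}

within_p01 = median_block_distance(toy, "P01", "P01")
within_p05 = median_block_distance(toy, "P05", "P05")
between = median_block_distance(toy, "P01", "P05")
print(within_p01, within_p05, between)
```

A clonal population corresponds to within-subject blocks at or near zero, while unrelated strains give large between-subject blocks.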

Our antibiotic resistance phenotype testing and gene content analysis likewise showed homogeneity across isolates from a given subject and extensive differences between isolates from different subjects (Fig. 3a and Supplementary Fig. 12; Supplementary Note 10). The core genes that were common in isolates between subjects were enriched for function in central dogma processes, while genes that varied between subjects were enriched for functions related to DNA transposition, recombination, restriction-modification, and transition metal response genes tied to mercury resistance (Supplementary Fig. 13; Supplementary Data 6 and 7; Supplementary Note 11). To fully validate the small number of variant calls made among the patient-specific isolate populations, the no-calls we made at loci with metrics near our analysis thresholds (Supplementary Fig. 14; Supplementary Data 5; Supplementary Note 9), and the variants responsible for drug resistance (described below), we performed Sanger sequencing, finding agreement with the short-read analysis in all cases.

[Figure 2 panel residue. Recoverable labels: (a) primary plate and control plate; (b) concordant versus discordant SNP calls across the reference and replicates 1–3; (c) fraction of concordant SNPs and fraction of discordant SNPs versus filtering threshold (allelic fraction); (d) base calling accuracy of device libraries, sensitivity versus 1−specificity, including bench-top 24 ng deep sequencing at 53×; (e) homology tree of subjects P01–P06 (asterisks mark P03 and P04); (f) heat map of inter-subject SNP counts, with block medians between 25,359 and 89,234.]

Figure 2 | Pseudomonas aeruginosa clinical isolate sequencing and error correction. (a) Four types of samples were collected from six different subjects: (1) bronchial alveolar lavage, (2) sputum, (3) thoracostomy and (4) urine. Samples were streaked on primary plates and two control plates, which are expanded from single colonies. (b) Concordant SNP call sets are defined as call sets that agree among all technical replicate libraries. (c) The fraction of concordant and discordant SNPs among technical replicates (n = 3) of one isolate from each subject (P01-4, P02-26, P03-58, P04-72, P05-105 and P06-115, where the notation P01-4 means patient subject 1, sample 4) is plotted with decreasing AF stringency (minimum read depth fixed at 6; minimum mapping quality fixed at 45; excluding indels; see Methods section). Inset: thresholding at an AF value of 0.82 (circles) balances the maximization of concordant SNPs and minimization of discordant SNPs and was used for variant calling across all our samples. (d) Receiver operator characteristic (ROC) plot compares the accuracy and sensitivity in SNP calling among low-input libraries of sample P01-4 made in the device, on the bench-top libraries from individual replicates, from pooled triplicate replicates, and at different average coverage levels. With the AF threshold value at 0.82 (red circles), there is no meaningful difference between microfluidic and bench-top libraries, or between single replicate and pooled triplicates at equal coverage. Due to differences in gene content and genomic structure of the reference and the subject P01 strain, 7% of the reference genome had no coverage in this analysis and 1.9% of the remaining sequence was masked due to poor average read mapping quality (MQ<45; Supplementary Note 9). (e) Homology tree constructed based on the number of inter-subject SNPs. (f) Heat map of SNPs between each sample, grouped by the patient subject. Asterisks indicate control plate isolates that were expanded from single colonies. Numbers represent median number of SNPs in each subject pair block.

WGS sequencing accurately predicts antibiotic resistance. Our sequence analysis of the clinical Pseudomonas isolates identified variant bases known to confer resistance to imipenem, ciprofloxacin and ceftazidime, three antibiotics commonly used to treat pulmonary P. aeruginosa infections (Fig. 3a; Supplementary Data 8). All 24 isolates from subjects P01 and P04 had a non-synonymous SNP, V126E, in the mexR (efflux pump) gene consistent with previously reported imipenem resistance mutations42. Similarly, all isolates from subjects P01 and P03 had two SNPs, T83I in the gyrA (gyrase A) gene and S87L in the parC (topoisomerase IV) gene, each reported to yield ciprofloxacin resistance43. All 12 isolates from subject P06 had a ciprofloxacin resistance mutation, S466Y in the gyrB44 (gyrase B) gene, that was not found in the other subjects. In our drug resistance phenotype test45 (Fig. 3a; Supplementary Note 10), 100% of the phenotypically observed variability in resistance was explained by our low input, high-throughput WGS data (Fig. 3a).

[Figure 3 panel residue. Recoverable labels: (a) disc diffusion antibiotic susceptibility test for IPM, CIP and CAZ in subjects P01–P06, inhibition zone radius (mm), with detected resistance SNPs mexR V126E, gyrA T83I, parC S87L and gyrB S466Y; (b) phylotyping of soil micro-colonies (P. sp. GM74, P. sp. GM16, P. fragi B25, P. sp. GM78, P. chlororaphis GP72, Serratia proteamaculans S4 and Variovorax sp. CF313, spanning Gammaproteobacteria and Betaproteobacteria; red numbers = SNP counts); (c) antiSMASH analysis heat map by sample ID across predicted secondary metabolite clusters (for example cupriachelin, delftibactin, entolysin, orfamide, pyoverdine, syringafactin, tolaasin and xantholysin).]

Figure 3 | Functional genomics from low-input microbial samples. (a) Antibiotic susceptibility phenotyping and genotyping. Antibiotic susceptibility was tested on a randomly sampled subset of isolates from each subject by the disc diffusion susceptibility assay. The drugs tested are IPM, CIP and CAZ (10 µg imipenem, 5 µg ciprofloxacin and 30 µg ceftazidime, respectively). Raw images of one plate from each subject that was analysed (left). Plot shows inhibition zone radius for each antibiotic and known antibiotic resistance SNPs detected by WGS, grouped by subjects (right). The samples with smaller radii close to the Clinical & Laboratory Standards Institute (CLSI) break point (red line) indicate samples that are resistant to the specific antibiotic (grey points). Radii greater than the green line indicate samples identified as susceptible by the CLSI break point (blue points). The first ten samples from subject P05 were measured on a separate day from the remaining P05 samples; the apparent difference in susceptibility in these samples and the remaining P05 samples is most likely due to a systematic difference in the assay (possibly image contrast) on the second day or degradation of the drug sample used between the two measurement sets (we classify all the P05 isolates tested as susceptible to ciprofloxacin). (b) Phylotyping of the soil micro-colonies and (c) secondary metabolite class prediction using antiSMASH analysis of de novo assembled genomic contigs. The clustering of samples in the phylotyping results is reflected in the secondary metabolite predictions. The red numbers in (b) indicate the number of SNPs between the strains spanning each value. The heat map values in (c) represent empirical similarity scores11.

Direct de novo WGS analysis of soil micro-colonies. Our microfluidic platform showed data performance equal to bench-top library construction methods with orders of magnitude less input, lending the confidence to carry out strain and natural product prediction from the iChip micro-colony samples. We de novo assembled the genome of each micro-colony data set produced as described above (Supplementary Data 9) and phylotyped the micro-colonies at the strain level using a multi-locus sequence typing analysis46.

Of the 14 colonies analysed, 12 represented five different known Pseudomonas strains (P. sp GM74, P. sp GM78, P. sp GM16, P. fragi B25 and P. chlororaphis GP72). We identified the remaining two colonies as Serratia proteamaculans S4 and Variovorax sp. CF313, respectively (Fig. 3b and Supplementary Fig. 15). Our high-GC Pseudomonas assemblies, based on low-input sample-processing direct from cells, were more contiguous than recently reported draft environmental Pseudomonas assemblies produced from sequence libraries prepared by standard, high-input methods47 (Supplementary Data 9). The device was also more reliable in producing libraries from our micro-colony samples, as two of the bench-top libraries dropped out entirely during the sample preparation process (Fig. 1f and Supplementary Fig. 16) and six of the fourteen bench-top libraries were at or below a library complexity of 200 genomic equivalents, the minimum required to enable 50× unique coverage with acceptable read duplication rate.

Soil samples are frequently mined for natural products, recently yielding teixobactin, a promising agent under development for treating Gram-positive bacterial infections48. We analysed our micro-colony draft genomes for natural product genes using the antiSMASH tool11 (Fig. 3c). While the metabolite-producing gene cluster profiles between the device libraries and bench-top libraries mostly agreed (except where the bench-top samples dropped out entirely), the Serratia and Variovorax samples gave divergent results, with the higher-quality microfluidic libraries likely producing more accurate gene calls (Supplementary Fig. 16).

Discussion

Despite the ability of Illumina's HiSeq and NextSeq platforms to process hundreds of microbial sequence libraries per run, genomic analysis is underutilized in epidemiology, clinical care and natural product discovery due to the complexity, limited throughput, labour-intensity, and input quantity requirements of available sample preparation methods. Here, we piloted an automated microfluidic system that integrates at high throughput all the key steps in NGS sample preparation: cell concentration, lysis, fragmentation, adaptor tagging, fragment purification and size selection (Supplementary Fig. 2; Supplementary Data 2, 10 and 11). By lowering input requirements 200-fold, our system also enables new applications such as rapid analysis of slow-growing pathogen isolates and direct analyses of samples with limited biomass.

We constructed nearly 400 WGS sequence libraries from 124 isolates of P. aeruginosa collected from six patients. Our analysis revealed tremendous inter-subject diversity in gene content and genome sequence even though all six samples were collected from the same hospital within a period of a few weeks. Despite the dynamic accessory genome known in Pseudomonas, we found negligible intra-subject variation in our samples, indicating that the infected sites contained essentially clonal Pseudomonas populations, not the diverse Pseudomonas populations often described in the chronic pulmonary infections common among cystic fibrosis patients (Supplementary Note 12). The population signature we observed is consistent with community-acquired infection or infection of these patients from their own microbiota in the hospital setting.

The genomic determinants of imipenem and ciprofloxacin resistance were apparent in our variant calls and concordant with phenotypic susceptibility assays. We also found substantial gene content and sequence variation across strains with the same multiple resistance phenotype, indicating that these strains most likely acquired resistance to the same set of drugs independently. As is often found in Pseudomonas, we noticed diversity in the gene content across the isolates from different patients, notably enrichment of DNA integration and restriction functions in the variable genome that may be related to horizontal gene transfer, subject-specific prophage (Supplementary Figs 17 and 18) and other putatively exogenous sequences (Supplementary Fig. 19; Supplementary Data 12; Supplementary Note 13), as well as mercury resistance elements, which have previously been linked to the development of antibiotic resistance49,50.

These analyses were enabled by the Q68 SNP calling performance achieved by the tuned variant caller38. Replicate sequencing that enabled this performance increases the need for automation of sample preparation and low-input sample preparation, but has the power to discriminate sequencing errors from errors introduced during library construction including base damage, PCR-derived mutations, and chimeras.
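To put the Q68 figure in context, the sketch below converts a quality score to a per-base error probability. It assumes that Q68 is a Phred-scaled value (Q = -10 log10 p), which the text does not state explicitly; the 6.5 Mb genome size is an approximate figure for P. aeruginosa used only for illustration.

```python
import math


def phred_to_error_probability(q):
    """Per-base error probability implied by a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred-scaled quality score for a given per-base error probability."""
    return -10 * math.log10(p)


q68_error = phred_to_error_probability(68)   # roughly 1.6e-7 errors per base
# Assumed genome size of about 6.5 Mb (illustrative, not taken from the text).
expected_false_calls = q68_error * 6.5e6     # on the order of one per genome
print(f"{q68_error:.3g} errors per base, {expected_false_calls:.2f} per genome")
```

Under these assumptions, Q68 corresponds to roughly one erroneous call per genome of this size.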

We demonstrated that the microfluidic system's integrated DNA extraction capability works at very low input levels for challenging sample types like M. tuberculosis cells and soil micro-colonies. This eliminates the need for pre-amplification by whole-genome amplification, which increases workflow complexity and degrades data quality. The input reduction to thousands of cells could cut diagnostic test times in half for slow-growing pathogens like M. tuberculosis. Lowering input requirements also enables new organism and natural product discovery by direct WGS analysis of environmental microbes that are hard to culture, such as the iChip micro-colonies. Side-by-side comparison of soil micro-colony WGS analysis with an optimized bench-top procedure makes clear the superior performance of the new microfluidic method in reliability and critical data metrics. Future synthetic biology approaches51 to synthesize natural product genes for expression in industrial production strains would place even higher demands on sequence and assembly quality. Re-discovery of known natural products is a limiting factor in the discovery of novel bioactive compounds52, placing a premium on isolating the most challenging-to-grow organisms that are enriched in diversity versus known microbes.

We expect the increased automation, throughput and low-input capability of our microfluidic library construction method to enable a wide variety of future applications. Current library preparation protocols require nanograms to micrograms of DNA that have been previously extracted from cells. Some protocols require that DNA be pre-fragmented and size-selected in addition. Eliminating such challenging input requirements could expand the use of less invasive but low-yield sampling and biopsy procedures, enable direct pathogen identification in tiny microbiome samples, allow better enrichment of cell types or tissue regions of interest in clinical micro-samples, and eliminate whole-genome pre-amplification from some workflows53. The high-throughput and high-accuracy sample preparation method presented here will power applications including precision medicine, genomic surveillance, antibiotic resistance tracking25 and novel organism/natural product discovery on large scales.

Methods

Microfluidic device fabrication. Microfluidic devices were fabricated by multilayer soft lithography of PDMS, a transparent silicone elastomer, on a mould consisting of a silicon wafer patterned with photoresist. Mould and device fabrication were carried out at the Broad Institute. Separate moulds were used to cast a control layer of height 50 µm and a flow layer of height 6 mm. The two layers were partially cured, aligned manually and thermally bonded by further curing. Inlet ports were punched and the two-layer PDMS device was bonded to a glass slide after activation by air plasma exposure.

The flow layer mould was patterned in five steps: (1) rectangular 15 µm, (2) rectangular 5 µm, (3) rounded 15 µm, (4) rectangular 30 µm and (5) rectangular 100 µm features. The rectangular features were made by spin coating SU-8 2001 (5 µm), 2015 (15 and 30 µm) and 2050 (100 µm) photoresists (Microchem) on a silicon wafer. The coated wafers were patterned by ultraviolet exposure (OAI 206 mask aligner) through a mask printed at 20,000 dpi (Fineline; see AutoCAD design files for each layer in supplementary material), followed by feature development in SU-8 developer (Microchem). The rounded features were produced by spin coating AZ-4620 photoresist (Microchem) after coating with hexamethyldisilazane (Sigma) and air-drying, patterning the wafer with UV exposure and a mask, developing in the AZ 400 K developer (Microchem), and slowly heating (from 65 °C to 190 °C over 4 h) the wafer above the resist Tg (120 °C) to round the features. The control layer mould was patterned in two steps: (1) rectangular 15 µm and (2) rectangular 90 µm features, using methods similar to those for the flow layer. Device production utilized standard soft lithography protocols. PDMS (Momentive) was mixed at a 5:1 silicone to cross-linker ratio in a Thinky AR250 mixer, poured onto the flow layer mould, degassed in a vacuum chamber and cured by baking for 1 h at 80 °C. PDMS at a 20:1 silicone to cross-linker ratio was spin coated on the valve layer mould to a height of 50 µm, then baked at 80 °C for 40 min. The two layers were aligned under a stereomicroscope (Nikon), and further baked for 2 h at 80 °C to complete thermal bonding. Inlet holes were punched into the two-layered PDMS device (Syneo, ID 660 µm tips) on the designated input/output port features. The device was then exposed to atmospheric plasma for 30 s at a pressure of 1.3 mbar (Diener ATTO), bonded to a clean glass slide and baked for 3 h at 80 °C.
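The two PDMS mixing ratios can be turned into weigh-out amounts with a short calculation. This sketch assumes the 5:1 and 20:1 ratios are by mass, which the text does not specify, and the batch sizes are hypothetical.

```python
def pdms_masses(total_g, base_parts, crosslinker_parts=1.0):
    """Split a PDMS batch into base and cross-linker masses for a given ratio."""
    total_parts = base_parts + crosslinker_parts
    base_g = total_g * base_parts / total_parts
    return base_g, total_g - base_g


# Hypothetical batch sizes; only the ratios come from the protocol above.
thick_layer = pdms_masses(60.0, 5)    # 5:1 mix poured on the flow layer mould
thin_layer = pdms_masses(21.0, 20)    # 20:1 mix spin coated on the valve layer mould
print(thick_layer, thin_layer)
```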

Device design file and controller. Flow and control layers were pneumatically activated by an array of 40 solenoid valves (Pneumadyne) controlled by a USB interface (McMaster) connected to a computer. The device controller was home-built following designs and procedures specified by Dr Rafael Gómez-Sjöberg and Prof. Stephen Quake's group at Stanford University, as described at https://sites.google.com/site/rafaelsmicrofluidicspage/valve-controllers/usb-based-controller

The valve operation pressure was 40 PSI and sample/reagent solutions were driven at 20 PSI.

The device CAD file is publicly available at https://sourceforge.net/projects/sk-dev-cad-analysis-software/files/Kim_supplementaryfiles.zip/download

Clinical P. aeruginosa isolate sample collection and culture. The clinical samples used in this study were discards from the Brigham and Women's Hospital microbiology lab, and were disconnected from patient meta-data. Samples from selected sites of infection were streaked on selective media plates containing MacConkey agar. Between 12 and 24 P. aeruginosa colonies were identified by appearance and randomly picked (Fig. 2a). P. aeruginosa isolates were received at the Broad Institute as frozen liquid cultures. They were grown to mid-log phase at a concentration of 5 × 10⁸ cells ml⁻¹ in lysogeny broth (LB). Culture density was monitored by OD600 using a UV–vis spectrophotometer.

DNA extraction from clinical isolates (bench-top method). DNA was purified from bacterial cultures using Qiagen DNeasy Blood and Tissue kits on the QIAcube instrument (Qiagen). The cell input was between one and two million cells per sample. After extraction, the concentration of purified gDNA was measured by absorption at 260 nm (Nanodrop) and normalized to 20 ng µl⁻¹.

M. tuberculosis culture. Two M. tuberculosis clinical isolates (OFXR14 and OFXR16) were grown in Middlebrook 7H9 medium supplemented with OADC (Becton Dickinson), 0.05% Tween 80 (Sigma) and 0.2% glycerol at 37 °C with shaking. Samples were taken at OD600 0.05 to 0.1, which corresponds to ~1.5–3 × 10⁷ bacterial cells per ml. A 100 ml culture sample from each isolate was heat inactivated at 80 °C for 2 h, flash frozen and used for all analyses.

iChip soil bacteria culture. A 1 g sample of soil collected from a private garden in Boston, MA was agitated vigorously for 10 min in 10 ml sterile phosphate buffered saline (PBS). The soil was left to settle for 5 min and the supernatant was diluted again in PBS. Dilutions were then mixed with molten agar media (0.1 g starch, 1.0 g casamino acids, 15 g technical agar per 1 l of H2O). The soil suspension was diluted further with agar to achieve an average concentration of one cell per 100 µl. Then 100 µl of the soil-agar suspension was dispensed into each well of an iChip22, which had a 0.03 µm polycarbonate membrane attached to the bottom via silicone glue. After the agar suspension was solid, a 0.03 µm polycarbonate membrane was attached to the top of the iChip using silicone glue. The iChip was incubated in direct contact with moist soil in the dark for 2 weeks. After 2 weeks, iChips were disassembled and individual colonies were picked with sterile toothpicks using a dissection microscope, and placed into 15 µl PBS for further analysis.

Tagmentation (bench-top method, 24 ng input). Bench-top library construction was done following the Illumina Nextera protocol for tagmentation. 5 µl of 4.8 ng µl⁻¹ (24 ng total) of purified gDNA was mixed with 15 µl of Nextera enzyme (Illumina), 5 µl of 5× Tagmentation buffer (50 mM Tris-HCl pH 8.0 (Sigma), 20 mM MgCl2 (Sigma)) and 10 µl of H2O, then incubated for 10 min at 58 °C. 2.5 µl of stop solution (2.5% wt per vol SDS (Sigma) in H2O) was added to the mixture, followed by incubation for 10 min at 72 °C. Tagmented DNA was purified by mixing with 49.5 µl SPRI bead suspension (Beckman-Coulter), binding for 10 min at 25 °C, magnetically separating the beads, washing twice with ethanol and eluting the product off the beads with 50 µl of 10 mM Tris-HCl pH 8.0.
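The volumes in this protocol can be checked with a few lines of bookkeeping. The sketch below uses only the quantities listed above, read in microlitres; the variable names are illustrative.

```python
# Component volumes (microlitres) of the bench-top tagmentation reaction.
TAGMENTATION_UL = {
    "gDNA at 4.8 ng/ul": 5.0,
    "Nextera enzyme": 15.0,
    "5x tagmentation buffer": 5.0,
    "water": 10.0,
}
STOP_SOLUTION_UL = 2.5
SPRI_BEADS_UL = 49.5

dna_input_ng = TAGMENTATION_UL["gDNA at 4.8 ng/ul"] * 4.8    # total DNA input
reaction_ul = sum(TAGMENTATION_UL.values())                   # before stop solution
stopped_ul = reaction_ul + STOP_SOLUTION_UL                   # after stop solution
spri_ratio = SPRI_BEADS_UL / stopped_ul                       # bead-to-sample volume ratio

print(f"{dna_input_ng:.0f} ng in {stopped_ul} ul, SPRI ratio {spri_ratio:.2f}x")
```

With these numbers the stopped reaction is 37.5 µl, so the 49.5 µl of bead suspension corresponds to a bead-to-sample ratio of about 1.32×.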

Commercial DNA extraction method (bench-top bead beating). We attempted using the MoBio Power-soil kit (Qiagen) with 7 µl of cells (~10⁴ total cells) to extract gDNA for library construction but did not obtain a sufficient quantity of tagmented DNA fragments after library construction (<1 picogram of amplifiable tagmented product; data not shown).

Library quantification (qPCR). Before enrichment PCR, an aliquot from each library produced on the device was quantified by qPCR to verify successful library construction. qPCR was performed by mixing 1 µl of tagmented product with 0.5 µl Eva green dye (Evrogen), 0.5 µl of Rox reference dye (Roche), 0.5 µl of N12 Nextera barcoding primer (Illumina), 0.5 µl of E502 Nextera barcoding primer (Illumina), 4 µl of DNA polymerase ready mix (Illumina) and 3 µl of H2O, then performing qPCR in a real-time thermocycler (Stratagene MX3005p). The thermal programme was: 5 min at 72 °C, 1 min at 95 °C, then 40 cycles of (1) 10 s at 95 °C, (2) 30 s at 60 °C and (3) 30 s at 72 °C. To quantify the properly adapted tagmented library molecules in each sample, the qPCR amplification curve of each sample was compared with the curves resulting from analysis of purified standards (amplified, size-selected library) of similar average molecular weight and GC content. The reference libraries were quantified using the Qubit method (ThermoFisher) and Kapa library quantification kits (Kapa Biosystems).
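One common way to compare a sample against purified standards is a threshold-cycle (Ct) standard curve, sketched below. The text does not say that this exact procedure was used, and the standard quantities and Ct values here are hypothetical numbers chosen for illustration.

```python
import math


def fit_standard_curve(quantities_pg, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_pg]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate library quantity (pg) from a Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of a reference library and its measured Ct values.
STANDARD_PG = [0.1, 1.0, 10.0, 100.0]
STANDARD_CT = [28.00, 24.68, 21.36, 18.04]

slope, intercept = fit_standard_curve(STANDARD_PG, STANDARD_CT)
sample_pg = quantity_from_ct(22.36, slope, intercept)   # hypothetical sample Ct
print(f"slope {slope:.2f}, estimated quantity {sample_pg:.1f} pg")
```

A slope near -3.32 cycles per tenfold dilution corresponds to approximate doubling per cycle, which is a useful sanity check on the standards before any sample is interpolated.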

Enrichment/barcoding PCR and sample pooling. The sequencing libraries created on the microfluidic devices were barcoded using the Broad Institute's dual barcoding primers (Broad Genomics Platform), for which validation was not yet complete. Eight libraries were lost in the enrichment PCR step from the Pseudomonas set due to bad primers.

4 µl of each tagmented sample library (~5 pg) was mixed with 0.5 µl of primer 1, 0.5 µl of primer 2, 4 µl of amplification mix (Illumina) and 1 µl of H2O. Our libraries were amplified by 15 cycles of PCR (optimized for 5 pg input), SPRI purified and Qubit quantified. The libraries were then pooled into a single mix in equal concentrations based on their Qubit-measured concentrations.
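Pooling by measured concentration amounts to taking a volume of each library inversely proportional to its concentration. The sketch below pools equal mass per library; the library names, concentrations and target mass are hypothetical, and equal mass equals equal molarity only if the libraries have similar fragment sizes.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_library_ng):
    """Volume (ul) of each library that contributes the same DNA mass to the pool."""
    volumes = {}
    for name, conc in concentrations_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has no measurable DNA")
        volumes[name] = mass_per_library_ng / conc
    return volumes


# Hypothetical Qubit readings (ng/ul) for three barcoded libraries.
qubit = {"lib_A": 2.0, "lib_B": 4.0, "lib_C": 0.5}
print(pooling_volumes(qubit, mass_per_library_ng=4.0))
```

Raising an error for a zero reading keeps a failed library from silently demanding an unbounded volume.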

DNA sequencing. Quality assessment runs (library construction efficiency, sensitivity versus specificity, k-mer abundance and fragment size distribution) and production runs of test libraries were carried out by sequencing across variably configured Illumina MiSeq runs (2 × 50, 2 × 75, 2 × 150, 2 × 250 and 2 × 300). The quality-assessed samples were sequenced on the Illumina HiSeq 2500 platform (2 × 125 cycles or 2 × 101 cycles) in the Broad Genomics Platform.
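As a rough guide to what these paired-end configurations provide, mean coverage is the number of sequenced bases divided by genome size. The sketch below ignores duplicate and unmapped reads, and the 6.5 Mb genome size is an approximate value for P. aeruginosa assumed for illustration.

```python
import math


def mean_coverage(read_pairs, read_length, genome_size):
    """Mean depth from paired-end reads, ignoring duplicates and unmapped reads."""
    return 2 * read_pairs * read_length / genome_size


def read_pairs_for_coverage(target_coverage, read_length, genome_size):
    """Smallest number of read pairs that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / (2 * read_length))


GENOME_SIZE = 6_500_000   # assumed approximate genome size, in bases
pairs_needed = read_pairs_for_coverage(50, 125, GENOME_SIZE)   # 2 x 125 run
print(pairs_needed, mean_coverage(pairs_needed, 125, GENOME_SIZE))
```

In practice the required number of pairs is higher, since duplicated reads do not add unique coverage.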

Antibiotic susceptibility testing. Clinical Pseudomonas aeruginosa isolates were grown to mid-log phase in Luria broth (LB), and 100 µl of each culture was spread on an LB agar plate and allowed to dry for 15 min face-up at room temperature. Disks impregnated with ceftazidime (30 µg, Becton Dickinson), ciprofloxacin (5 µg, Becton Dickinson) or imipenem (10 µg, Becton Dickinson) were placed at evenly spaced distances on these plates, which were then incubated for 16 h at 37 °C. Plates were imaged using a FluorChem FC2 system (Alpha-Innotech) set to the visible range, and the radius of the zone of inhibition around each disk was measured using a custom Matlab image analysis script. These radii were compared with standards determined by the Clinical and Laboratory Standards Institute (CLSI). The breakpoints for each antibiotic are as follows (S and R stand for susceptible and resistant, respectively):

(1) Ciprofloxacin: S ≥ 21 mm; R ≤ 15 mm
(2) Imipenem: S ≥ 16 mm; R ≤ 13 mm
(3) Ceftazidime: S ≥ 18 mm; R ≤ 14 mm
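Applying these breakpoints to a measured zone size can be sketched as follows. The thresholds are treated as inclusive, values between them are labelled intermediate ("I") as an assumption not stated in the text, and the measurement is assumed to be expressed on the same scale as the breakpoints.

```python
# Breakpoints in mm: (minimum zone for susceptible, maximum zone for resistant).
BREAKPOINTS_MM = {
    "ciprofloxacin": (21, 15),
    "imipenem": (16, 13),
    "ceftazidime": (18, 14),
}


def classify(antibiotic, zone_mm):
    """Return 'S', 'I' or 'R' for a zone of inhibition measured in mm."""
    susceptible_min, resistant_max = BREAKPOINTS_MM[antibiotic]
    if zone_mm >= susceptible_min:
        return "S"
    if zone_mm <= resistant_max:
        return "R"
    return "I"   # assumed intermediate category between the two thresholds


print(classify("ciprofloxacin", 12), classify("imipenem", 20))
```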

Analysis software and commands. Software and commands used for sequencing analysis are available in the supplementary methods. Custom analysis software is publicly available at https://sourceforge.net/projects/sk-dev-cad-analysis-software/files/Kim_supplementaryfiles.zip/download

Data availability. Primary accessions. National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA):
clinical P. aeruginosa raw reads (fastq): PRJNA295070
clinical P. aeruginosa de novo assemblies (fasta): PRJNA295070
low-input E. coli raw reads (fastq): PRJNA295070
M. tuberculosis raw reads (fastq): PRJNA295070
soil micro-colonies raw reads (fastq): PRJNA295070

Referenced accessions. NCBI Reference Sequence:
Pseudomonas aeruginosa UCBPP-PA14 complete genome: NC_008463.1
Mycobacterium tuberculosis OFXR-14 scaffold: GCF_000660185.1
Mycobacterium tuberculosis OFXR-16 scaffold: GCF_000660225.1

References

1. Yatsunenko, T. et al. Human gut microbiome viewed across age and geography. Nature 486, 222–227 (2012).

2. Marvig, R. L., Sommer, L. M., Molin, S. & Johansen, H. K. Convergent evolution and adaptation of Pseudomonas aeruginosa within patients with cystic fibrosis. Nat. Genet. 47, 57–64 (2014).

3. Dallman, T. J. et al. Whole-genome sequencing for national surveillance of shiga toxin-producing Escherichia coli O157. Clin. Infect. Dis. 61, 305–312 (2015).

4. Nasser, W. et al. Evolutionary pathway to increased virulence and epidemic group A Streptococcus disease derived from 3,615 genome sequences. Proc. Natl Acad. Sci. USA 111, E1768–E1776 (2014).

5. Chewapreecha, C. et al. Dense genomic sampling identifies highways of pneumococcal recombination. Nat. Genet. 46, 305–309 (2014).

6. Smith, E. E. et al. Genetic adaptation by Pseudomonas aeruginosa to the airways of cystic fibrosis patients. Proc. Natl Acad. Sci. USA 103, 8487–8492 (2006).

7. Snitkin, E. S. et al. Tracking a hospital outbreak of carbapenem-resistant Klebsiella pneumoniae with whole-genome sequencing. Sci. Transl. Med. 4, 148ra116 (2012).

8. Centers for Disease Control and Prevention. Antibiotic-Resistant Gonorrhea - STD information from CDC. Available at https://www.cdc.gov/std/gonorrhea/arg/basic.htm

9. Obama, B. Executive Order--Combating Antibiotic-Resistant Bacteria | whitehouse.gov. The White House (2014). Available at https://www.whitehouse.gov/the-press-office/2014/09/18/executive-order-combating-antibiotic-resistant-bacteria

10. Newman, D. J. & Cragg, G. M. Natural products as sources of new drugs over the 30 years from 1981 to 2010. J. Nat. Prod. 75, 311–335 (2012).

11. Weber, T. et al. antiSMASH 3.0--a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res. 43, W237–W243 (2015).

12. Kim, H. et al. A microfluidic DNA library preparation platform for next-generation sequencing. PLoS ONE 8, e68988 (2013).

13. de Bourcy, C. F. A. et al. A quantitative comparison of single-cell whole genome amplification methods. PLoS ONE 9, e105585 (2014).

14. Kram, K. E. & Finkel, S. E. Rich medium composition affects Escherichia coli survival, glycation, and mutation frequency during long-term batch culture. Appl. Environ. Microbiol. 81, 4442–4450 (2015).

15. Blainey, P. C. The future is now: single-cell genomics of bacteria and archaea. FEMS Microbiol. Rev. 37, 407–427 (2013).

16. Salter, S. J. et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 12, 87 (2014).

17. Jones, M. B. et al. Library preparation methodology can influence genomic and functional predictions in human microbiome research. Proc. Natl Acad. Sci. USA 112, 14024–14029 (2015).

18. Brooks, J. P. et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol. 15, 66 (2015).

19. Witney, A. A. et al. Clinical use of whole genome sequencing for Mycobacterium tuberculosis. BMC Med. 14, 46 (2016).

20. Wilson, M. R. et al. Actionable diagnosis of neuroleptospirosis by next-generation sequencing. N. Engl. J. Med. 370, 2408–2417 (2014).

21. Naccache, S. N. et al. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res. 24, 1180–1192 (2014).

22. Nichols, D. et al. Use of ichip for high-throughput in situ cultivation of "uncultivable" microbial species. Appl. Environ. Microbiol. 76, 2445–2450 (2010).

23. Anderson, S. Shotgun DNA sequencing using cloned DNase I-generated fragments. Nucleic Acids Res. 9, 3015–3027 (1981).

24. Ramsköld, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).

25. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

26. Robertson, G. et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods 4, 651–657 (2007).

27. Melin, J. & Quake, S. R. Microfluidic large-scale integration: the evolution of design rules for biological automation. Annu. Rev. Biophys. Biomol. Struct. 36, 213–231 (2007).

28. Morinishi, L. S. & Blainey, P. Simple bulk readout of digital nucleic acid quantification assays. J. Vis. Exp. e52925 (2015).

29. Caruccio, N. Preparation of next-generation sequencing libraries using Nextera technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition. Methods Mol. Biol. 733, 241–255 (2011).

30. DeAngelis, M. M., Wang, D. G. & Hawkins, T. L. Solid-phase reversible immobilization for the isolation of PCR products. Nucleic Acids Res. 23, 4742–4743 (1995).

31. Kim, S. et al. High-throughput single-molecule optofluidic analysis. Nat. Methods 8, 242–245 (2011).

32. Bhattacharyya, A. & Klapperich, C. M. Thermoplastic microfluidic device for on-chip purification of nucleic acids for disposable diagnostics. Anal. Chem. 78, 788–792 (2006).

33. Tan, S. J. et al. A microfluidic device for preparing next generation DNA sequencing libraries and for automating other laboratory protocols that require one or more column chromatography steps. PLoS ONE 8, e64084 (2013).

34. Balagaddé, F. K., You, L., Hansen, C. L., Arnold, F. H. & Quake, S. R. Long-term monitoring of bacteria undergoing programmed population control in a microchemostat. Science 309, 137–140 (2005).

35. Parkinson, N. J. et al. Preparation of high-quality next-generation sequencing libraries from picogram quantities of target DNA. Genome Res. 22, 125–133 (2012).

36. Li, H. Mathematical Notes on SAMtools Algorithms. (2010). Available at https://www.broadinstitute.org/gatk/media/docs/Samtools.pdf

37. Kelley, D. R., Schatz, M. C. & Salzberg, S. L. Quake: quality-aware detection and correction of sequencing errors. Genome Biol. 11, R116 (2010).

38. Robasky, K., Lewis, N. E. & Church, G. M. The role of replicates for error mitigation in next-generation sequencing. Nat. Rev. Genet. 15, 56–62 (2014).

39. Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).

40. Lieberman, T. D. et al. Genetic variation of a bacterial pathogen within individuals with cystic fibrosis provides a record of selective pressures. Nat. Genet. 46, 82–87 (2014).

41. Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).

42. Pai, H., Kim, J., Lee, J. H., Choe, K. W. & Gotoh, N. Carbapenem resistance mechanisms in Pseudomonas aeruginosa clinical isolates. Antimicrob. Agents Chemother. 45, 480–484 (2001).

43. Lomholt, J. A. & Kilian, M. Ciprofloxacin susceptibility of Pseudomonas aeruginosa isolates from keratitis. Br. J. Ophthalmol. 87, 1238–1240 (2003).

44. Bruchmann, S., Dötsch, A., Nouri, B., Chaberny, I. F. & Haussler, S. Quantitative contributions of target alteration and decreased drug accumulation to Pseudomonas aeruginosa fluoroquinolone resistance. Antimicrob. Agents Chemother. 57, 1361–1368 (2013).

45. Bauer, A. W., Kirby, W. M., Sherris, J. C. & Turck, M. Antibiotic susceptibility testing by a standardized single disk method. Am. J. Clin. Pathol. 45, 493–496 (1966).

46. Darling, A. E. et al. PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ 2, e243 (2014).

47. Winsor, G. L. et al. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database. Nucleic Acids Res. 44, D646–D653 (2015).

48. Ling, L. L. et al. A new antibiotic kills pathogens without detectable resistance. Nature 517, 455–459 (2015).

49. Skurnik, D. et al. Is exposure to mercury a driving force for the carriage of antibiotic resistance genes? J. Med. Microbiol. 59, 804–807 (2010).

50. Rowland, I. R., Robinson, R. D. & Doherty, R. A. Effects of diet on mercury metabolism and excretion in mice given methylmercury: role of gut flora. Arch. Environ. Health 39, 401–408 (1984).

51. Smanski, M. J. et al. Synthetic biology to access and expand nature's chemical diversity. Nat. Rev. Microbiol. 14, 135–149 (2016).

52. Baltz, R. H. Antimicrobials from actinomycetes: back to the future. Microbe Am. Soc. Microbiol. 2, 125 (2007).

53. Fitzsimons, M. S. et al. Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome. Genome Res. 23, 878–888 (2013).

Acknowledgements

We thank Dr Yonatan Grad, Dr Deb Hung, Dr Ashlee Earl, Dr Christopher Desjardins, Dr Robert Lintner, Mr Bruce Walker, Mr Terrance Shea, Ms Sheila Fisher, Dr Niall Lennon, Dr Stacy Gabriel and members of the Blainey Lab for helpful discussions. The indexing barcodes were provided by the Broad Institute Genomics Platform. We also thank Dr Lynn Bry at Brigham and Women's Hospital for help with sample collection and for helpful discussions. This work was supported by the Burroughs Wellcome Fund via a Career Award at the Scientific Interface to P.C.B., Broad Institute Startup Funds and an Emerging Technologies Opportunity Program award from the Department of Energy Joint Genome Institute. S.K. was supported by the National Science Foundation Postdoctoral Fellowship in Biology 1308852.

Author contributions

S.K. and P.C.B. designed the microfluidic device and planned experiments. S.K. fabricated the device, built the experimental hardware, performed the necessary experiments and optimization of on-device sample preparation, and processed the micro-colony samples and M. tuberculosis samples into sequence libraries. S.K. and A.B.K. wrote the hardware control software. S.K. and J.D.J. processed the clinical isolates in the device. S.K., J.D.J. and J.N. performed the enrichment PCR on the bench. S.K., A.B.K. and D.F. performed SNP analysis. S.K., T.V. and J.D.J. performed the gene content analysis and the variable genome content analysis. S.K. and R.P.B. performed the Pseudomonas antibiotic resistance analysis. S.K. and D.F. analysed soil micro-colony sequences. B.B. and S.E. provided soil micro-colony samples from the iChip. J.G. provided M. tuberculosis samples. S.K., A.B.K., D.F. and P.C.B. wrote the manuscript.

Additional information

Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications

Competing financial interests: The Broad Institute and MIT may seek to file for intellectual property on and/or commercialize aspects of this work. J.D.J., A.B.K., D.F., T.V., R.P.B., B.B., J.G., J.N. and S.E. declare no competing interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

How to cite this article: Kim, S. et al. High-throughput automated microfluidic sample preparation for accurate microbial genomics. Nat. Commun. 8, 13919 doi: 10.1038/ncomms13919 (2017).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

© The Author(s) 2017


Details

Title

High-throughput automated microfluidic sample preparation for accurate microbial genomics

Author

Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C

Pages

13919

Publication year

2017

Publication date

Jan 2017

Publisher

Nature Publishing Group

e-ISSN

20411723

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1038/ncomms13919

ProQuest document ID

1862125468
