Detection of CRISPR Adaptation

Russian and American scientists have written a recent review on CRISPR technology. To read more about
CRISPR, see the website of Dr. Jennifer Doudna’s laboratory.

AUTHORS:

Anna Shiriaeva (1, 2), Ivan Fedorov (1, 3), Danylo Vyhovskyi (1), Konstantin Severinov (1, 2, 4)

AUTHOR AFFILIATIONS:

1 Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia.

2 Waksman Institute, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, U.S.A.

3 Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia.

4 Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow 119334, Russia.

ABSTRACT:

Prokaryotic adaptive immunity is built when short DNA fragments called spacers are acquired into CRISPR (clustered regularly interspaced short palindromic repeats) arrays. CRISPR adaptation is a multistep process which comprises selection, generation, and incorporation of prespacers into arrays. Once adapted, spacers provide immunity through the recognition of complementary nucleic acid sequences, channeling them for destruction. To prevent deleterious autoimmunity, CRISPR adaptation must therefore be a highly regulated and infrequent process, at least in the absence of genetic invaders. Over the years, ingenious methods to study CRISPR adaptation have been developed. In this paper, we discuss and compare methods that detect CRISPR adaptation and its intermediates in vivo and propose suppressing PCR as a simple modification of a popular assay to monitor spacer acquisition with increased sensitivity.

INTRODUCTION:

CRISPR-Cas are diverse (two classes, six types [1–3]) prokaryotic adaptive immunity systems that protect cells from phages and other mobile genetic elements (MGEs) [4,5]. They consist of CRISPR arrays and CRISPR-associated cas genes [6,7]. CRISPR arrays are composed of identical or highly similar repeats separated by unique DNA sequences called spacers [6,7]. The total number of spacers in an array varies from one to several hundred [6,8]. The source of the vast majority (∼93%) of spacers remains unknown; these spacers constitute the ‘dark matter’ of CRISPR [9]. Most of the remaining spacers map to MGEs and can be regarded as memories of prior encounters that cells store in CRISPR arrays [9]. Upstream of each CRISPR array, there is an AT-rich sequence called the ‘leader’ [7]. CRISPR arrays are transcribed from a promoter located in the leader, and the primary transcript is processed into CRISPR RNAs (crRNAs), each containing a single spacer and flanking sequences derived from repeats [10–17]. Cas proteins together with crRNAs form effector complexes (the Cascade complex in the type I-E system of Escherichia coli) that recognize ‘protospacers’: DNA or, sometimes, RNA sequences complementary to a crRNA spacer [13,18–20]. Recognition of protospacers in MGEs leads to their destruction [18–20]. CRISPR immunity is built during CRISPR adaptation, a process that entails incorporation of new spacers into the array [4]. New spacers are typically incorporated at the boundary between the leader and the first repeat and, therefore, the chronological order of spacer acquisition matches the inverse order of spacers in the array [4,21,22]. For every acquired spacer, a new copy of the repeat is generated [4,21,22]. The two most conserved Cas proteins, Cas1 and Cas2, common to almost all CRISPR-Cas systems, catalyze integration of spacer precursors (prespacers) into arrays [23–25].
Generally, the acquisition of spacers is not specifically targeted to MGEs, and thus spacers from the cell's own genome can also be acquired [23,26]. This can result in an autoimmune response that inhibits cell growth [27–29]. Not surprisingly, CRISPR adaptation is a tightly controlled process that normally proceeds with very low efficiency and can be difficult to detect both in natural settings and in laboratory experiments. Several methods for the detection of CRISPR adaptation have been developed and have helped to shed light on the molecular mechanisms governing spacer choice. These methods and their limitations are discussed below.

Selection-based methods of detection of CRISPR adaptation in individual cells or clones

An obvious way to detect the acquisition of a new spacer is to amplify the leader-proximal end of the CRISPR array with a pair of primers: one matching the leader, and another matching an internal, pre-existing spacer [23,30,31]. Since new spacers are usually incorporated in front of the first, leader-proximal repeat, and their incorporation results in repeat duplication [4,21–23], detection of PCR products extended by an integral number of spacer-repeat units reveals CRISPR adaptation events. However, since spacer acquisition can be very infrequent, specific selection of adapted cells is required (Figure 1). Examples of such selections include obtaining colonies of BIMs (bacteriophage insensitive mutants) (Figure 1A) [4,21,22,31,32] or PIMs (plasmid interfering mutants) (Figure 1B) [30,31,33]. These methods are cheap and do not require genetic manipulation of the cells under study, but they are biased towards interference-proficient spacers acquired from MGEs. They therefore cannot be used to detect spacers that do not lead to interference against MGEs or that lead to self-interference because they were acquired from the cell's own genome. Depending on the CRISPR-Cas subtype, such spacers can constitute from 2 to 99% of acquired spacers when interference is inactivated [23,26,34,35].
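The size logic behind this PCR readout can be sketched in a few lines of Python. This is only an illustration: the 61-bp unit is the E. coli type I-E value quoted in this review, while the 200-bp unexpanded amplicon length and both function names are hypothetical.

```python
UNIT_BP = 61  # spacer + repeat in E. coli type I-E (33-bp spacer / 28-bp repeat)


def expected_amplicon_bp(unexpanded_bp, units_acquired, unit_bp=UNIT_BP):
    """Expected amplicon length after acquiring a whole number of spacer-repeat units."""
    return unexpanded_bp + units_acquired * unit_bp


def units_from_amplicon(observed_bp, unexpanded_bp, unit_bp=UNIT_BP):
    """Number of acquired units implied by an amplicon length.

    Returns None when the extra length is not a whole number of units,
    i.e. when the product is not consistent with standard array expansion.
    """
    extra = observed_bp - unexpanded_bp
    if extra < 0 or extra % unit_bp != 0:
        return None
    return extra // unit_bp
```

For a hypothetical 200-bp unexpanded (+0) amplicon, a +1 array would give 261 bp and a +2 array 322 bp. A product that is not offset by a whole number of units is not classified as an expansion product.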

Two powerful experimental systems have been developed to overcome these limitations and increase the sensitivity of the detection of spacer acquisition events [36,37] (Figure 1C,D). Both systems are based on a reporter gene introduced upstream of the leader sequence of a specifically designed miniaturized E. coli CRISPR array. The reporter is transcribed from a promoter located downstream of the array in a direction opposite to the direction of leader-initiated CRISPR array transcription. The resulting mRNA includes a start codon followed by the leader-CRISPR array segment (cloned in reverse orientation) and the sequence of the reporter, which does not have a translational start of its own. In cells with unexpanded CRISPR arrays, translation of the reporter ORF is prevented by an in-frame stop codon within the leader. Insertion of an additional 61-bp unit (33-bp spacer/28-bp repeat) changes the reading frame, because 61 is not a multiple of three, and allows the synthesis of the reporter, leading to either chloramphenicol resistance [36] (Figure 1C) or fluorescence [37] (Figure 1D) of cells that acquired a spacer. Rare chloramphenicol-resistant colonies can be directly screened for CRISPR array expansion by PCR. With the fluorescent protein-based system, live fluorescence microscopy is used to observe and quantify cells that acquired spacers [37]. Though this has not been implemented yet, the use of FACS (fluorescence-activated cell sorting) should allow one to enrich the population of cells with expanded arrays for downstream analysis. With both systems, the acquisition of spacers that carry stop codons located in the reading frame of the reporter remains undetected. Likewise, incorporation of more than one spacer-repeat unit or incorporation of a single non-standard spacer that fails to restore the reporter reading frame will go undetected. Finally, CRISPR-Cas systems where incorporation of a standard spacer-repeat unit does not shift the reading frame (i.e. introduces an insertion whose length is n × 3 bp, where n is an integer) cannot be studied.
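The frame arithmetic behind these reporters can be illustrated with a minimal Python sketch. It considers insertion length only and ignores stop codons inside the acquired spacers. The function names are hypothetical, and the assumption that the reporter is read at the offset produced by a single unit follows the design described above rather than any published code.

```python
UNIT_BP = 61  # 33-bp spacer + 28-bp repeat, as quoted in the text


def frame_offset(n_units, unit_bp=UNIT_BP):
    """Shift (0, 1 or 2) of the downstream reading frame after n inserted units."""
    return (n_units * unit_bp) % 3


def unit_shifts_frame(unit_bp):
    """True if one spacer-repeat unit changes the reading frame.

    Systems whose unit length is a multiple of 3 bp cannot be studied
    with a frame-shift reporter of this kind.
    """
    return unit_bp % 3 != 0
```

With a 61-bp unit, one insertion gives an offset of 1, the frame in which the reporter is assumed to be read. Two insertions give an offset of 2, and three return the frame to 0, where the stop codon in the leader is in frame again.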

Detection of CRISPR adaptation in cell populations

In early studies of adaptation, the sequences of newly acquired spacers were determined for individual colonies by Sanger sequencing [4,21–23,30–33]. To analyze millions of CRISPR arrays in a single experiment, high-throughput sequencing (HTS) is usually used [38–40]. This allows one to study biases in spacer length, the distribution of corresponding protospacers along different DNA sources, and their nucleotide composition [34,35,38–52]. In principle, with sufficient sequencing depth, HTS of total genomic DNA purified from a culture should reveal reads corresponding to expanded arrays [53]. In a model system of E. coli cultures overproducing the Cas1–Cas2 adaptation protein complex and transformed with spacer-sized oligonucleotides, ∼350× genomic coverage allowed confident detection of CRISPR array expansion that occurred in ∼10% of cells [54]. Moreover, rarer off-target integration events elsewhere in the genome were also detected [54].
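As a rough consistency check on these numbers, the expected count of reads covering an expanded leader-repeat junction can be estimated as coverage multiplied by the fraction of adapted cells. The sketch below assumes uniform coverage and one array copy per genome. Both are simplifying assumptions, and the function name is hypothetical.

```python
def expected_expanded_reads(coverage, adapted_fraction):
    """Approximate number of reads spanning the array junction that come from expanded arrays.

    Assumes uniform coverage, so that about `coverage` reads span any given
    position, and that the adapted fraction of cells contributes proportionally.
    """
    return coverage * adapted_fraction


estimate = expected_expanded_reads(350, 0.10)
```

Under these assumptions, 350× coverage with 10% adapted cells gives an estimate of about 35 reads from expanded arrays.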

While clearly powerful and unbiased, the shotgun sequencing approach requires high sequencing coverage and provides very low numbers (dozens) of reads corresponding to expanded arrays, making it unsuitable for studies aimed at qualitative understanding of spacer selection preferences [54]. Therefore, the common strategy is to prepare PCR amplicons of arrays from cultures undergoing CRISPR adaptation and then subject them to HTS [38–40]. Gel electrophoresis is used to separate amplicons of initial, unexpanded CRISPR arrays (+0) from those that acquired one (+1), two (+2) or more spacer-repeat units (Figure 2A).

The main problem with PCR-based in-culture methods of detection of CRISPR adaptation is their low sensitivity, caused by more efficient amplification of shorter (and, in the most interesting cases, much more abundant) unexpanded CRISPR arrays [55]. In the case of the E. coli type I-E system, the standard method allows one to reliably detect amplicons of expanded arrays only in cultures that contain, in our experience, at least 5% of adapted cells. Several modifications aimed at increasing the sensitivity have been developed. The simplest one relies on amplification with a leader-specific primer and a primer matching a newly acquired spacer whose sequence is known [40]. After calibration to the amount of PCR product amplified from a region outside of the CRISPR array, which reflects the total number of DNA molecules in the sample, this method can be used to determine the efficiency of adaptation by qPCR [56]. The obvious drawback of this method is that it requires prior knowledge about the acquired spacer(s) and thus cannot be applied to study spacer acquisition in systems with unknown adaptation preferences. However, it is very powerful when studying acquisition from spacer-sized oligonucleotides transformed into cells [51,54,57].
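The calibration step can be illustrated with the standard ΔCt calculation, sketched below in Python. The text does not give a formula, so this is an assumption: both amplicons are taken to amplify with the same efficiency (2.0, i.e. perfect doubling per cycle, by default) and to be present once per genome. The function name and Ct values are hypothetical.

```python
def adapted_fraction_from_ct(ct_spacer, ct_reference, efficiency=2.0):
    """Estimate the fraction of arrays carrying the known spacer from qPCR Ct values.

    ct_spacer    : Ct of the leader / new-spacer amplicon (expanded arrays only)
    ct_reference : Ct of an amplicon outside the CRISPR array (all DNA molecules)
    efficiency   : fold amplification per cycle, assumed equal for both amplicons
    """
    return efficiency ** (ct_reference - ct_spacer)


# Hypothetical example: the spacer-specific product appears 10 cycles later
fraction = adapted_fraction_from_ct(ct_spacer=28.0, ct_reference=18.0)
```

With these hypothetical values the estimate is 2^-10, or roughly 0.1% of arrays. The result reflects only arrays that acquired the specific spacer targeted by the primer.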

Another modification uses a leader-specific primer and ‘degenerate’ primers matching the repeat sequence and containing one extra 3′-end nucleotide (Figure 2B) [42]. The additional position contains, in equal proportions, the three nucleotides other than the one complementary to the last nucleotide of the leader. Amplification products are therefore expected only if (i) a spacer has been acquired and (ii) its last nucleotide is different from the last nucleotide of the leader; in practice, however, amplicons from unexpanded arrays are still observed [42]. The method was reported to detect as few as 0.01% of cells with expanded arrays [42]. However, by design, up to ∼25% of acquired spacers remain undetected.

Moreover, the method is effective only when applied to engineered miniaturized CRISPR arrays reduced to just one repeat. Unexpanded arrays with multiple repeats yield multiple amplification products that cannot be distinguished from amplicons of expanded arrays.
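The design rule for the degenerate position, and the reason some spacers escape detection, can be sketched as follows. This is a simplified model that considers only the 3′-terminal base pair of the primer. The function names are hypothetical and the leader nucleotide used in the example is a placeholder, not the actual E. coli sequence.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def primer_3prime_pool(leader_last):
    """Bases allowed at the extra 3' position of the degenerate primer.

    The base complementary to the last nucleotide of the leader is excluded,
    so the primer should not extend from the repeat adjacent to the leader.
    """
    excluded = COMPLEMENT[leader_last]
    return sorted(base for base in "ACGT" if base != excluded)


def spacer_is_detectable(spacer_last, leader_last):
    """True if some primer in the pool pairs with the last nucleotide of the new spacer."""
    return COMPLEMENT[spacer_last] in primer_3prime_pool(leader_last)


# Placeholder leader ending in G: count which spacer endings can be detected
detectable = [b for b in "ACGT" if spacer_is_detectable(b, "G")]
```

In this model, three of the four possible spacer endings are detectable. If all four endings were equally frequent, one quarter of acquired spacers would be missed, which corresponds to the upper bound of ∼25% stated above.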

Reamplification of gel-purified amplicons of expanded arrays allows one to increase the sensitivity of detection of CRISPR adaptation in cell cultures [58]. Products of standard amplification are separated by gel electrophoresis and purified. The reamplification step is repeated until a fragment of the expected length becomes clearly visible on the gel (Figure 2C). The use of the automated BluePippin system (agarose gel electrophoresis with automated elution for size selection) allows one to improve the quality of separation, reduce contamination by amplicons of unexpanded arrays, and increase the reproducibility of analysis. Even when amplicons of expanded arrays are invisible after the first electrophoretic separation, DNA extracted from the corresponding position of the gel can be used for reamplification. Depending on the set of primers designed for reamplification (‘internal’, ‘degenerate’, or ‘repeat-specific’), this method is reported to detect, respectively, 1, 0.01, and 0.1% of cells with expanded arrays within E. coli cultures. ‘Internal’ primers can be the same as the ones used during initial amplification, or a leader-specific ‘nested’ reamplification primer annealing closer to the array can be used to increase specificity and avoid amplification of non-CRISPR DNA. In either case, amplification of unexpanded arrays co-purified with expanded ones is not suppressed. ‘Degenerate’ primers selectively suppress reamplification of unexpanded arrays, as described above. Reamplification with ‘repeat-specific’ primers relies on the fact that amplicons corresponding to expanded arrays have two repeats after the first PCR step (Figure 2C). Thus, amplification with primers matching the halves of the repeat sequence yields a PCR product only for expanded arrays.

The SENECA pipeline [59] selectively amplifies expanded CRISPR arrays. At the heart of the method is the construction of a plasmid-borne CRISPR array with an FaqI endonuclease recognition site immediately downstream of a miniaturized CRISPR ‘array’ consisting of a single repeat preceded by the leader sequence (Figure 2D). Unlike most Type II restriction endonucleases, FaqI, a Type II-S enzyme, cleaves DNA outside of its recognition site, generating a sticky end. The CRISPR array used in SENECA is designed such that the recognition of the FaqI site leads to cleavage in the upstream repeat. An Illumina adapter with a sticky end complementary to that generated by FaqI is ligated, and PCR with a pair of primers, one complementary to the repeat and the other to the adapter, selectively amplifies expanded arrays: the initial repeat sequence is lost after the FaqI treatment, and thus unexpanded arrays are not amplified. While the published SENECA protocol is based on the use of FaqI, other Type II-S restriction endonucleases could conceivably be used in lieu of FaqI.

The complete article is available at the DOI below.

DOI:

 10.1042/BST20190662