Development of 21 novel polynucleotide microsatellite markers in Clithon retropictum (Gastropoda: Neritidae) using next-generation sequencing
Received: October 14, 2017
Accepted: October 21, 2017
Published: September 13, 2017
Genet.Mol.Res. 16(4): gmr16039806
The neritid species Clithon retropictum (Gastropoda), an oviparous snail, is distributed in freshwater and brackish water in southern Korea, south-east Japan, southern China and Taiwan. This species is listed as a class II endangered species in Korea because of its decreasing population size, which has increased the need for genetic information about it. To this end, we developed microsatellite markers for C. retropictum using next-generation sequencing. A total of 185,972 sequences containing motifs with a minimum of five repeats were identified from 3,392,119 reads. Of the 46 loci screened, 36 were successfully amplified and 21 were polymorphic among 49 individuals, comprising 14 tri-nucleotide repeats and 7 tetra-nucleotide repeats. All loci except one (CR-08) exhibited relatively high genetic variability; the average number of alleles per locus was 10.33, and the mean observed and expected heterozygosities were 0.661 and 0.758, respectively. Our results demonstrate the utility of next-generation sequencing as a method for the rapid and cost-effective identification of microsatellites. These 21 newly developed microsatellite markers will be informative tools for investigating the genetic structure and diversity among populations of this endangered species and will help facilitate effective strategies for its conservation.
Clithon retropictum (Martens, 1879) (Gastropoda: Neritidae) is a small snail species (shell size of 15–25 mm) inhabiting hard substrates, such as stones and concrete blocks, in brackish or freshwater riverbeds in Korea, Japan, southern China and Taiwan (Furujo and Tomiyama, 2000; Ohara and Tomiyama, 2000; Noseworthy et al., 2013). It is among the longest-lived gastropods, with a lifespan of 12 years (Furujo and Tomiyama, 2000; Shigemiya and Kato, 2001). It is an ecosystem indicator species that plays an important role in assessing water quality, and it acts as a cleaning agent by ingesting sediments in the water.
It is designated as a class II endangered species by the Department of Environment of Korea and is also listed among the Marine Organisms under Protection by the Ministry of Oceans and Fisheries of Korea (Species Korea, https://www.nibr.go.kr). While the causes of the decline are unknown, habitat loss from coastal development and the ecological impacts of environmental change may be contributing factors. Species extinction and biodiversity loss are increasingly becoming a worldwide concern (Wallmo and Lew, 2011), fostering a need for a basic understanding of population genetics. Therefore, further assessment and monitoring of genetic diversity in endangered and key indicator species are required.
Among the various currently available molecular markers, microsatellites (MS), also known as simple sequence repeats (SSRs), are very useful for molecular genetic analyses because they have many desirable features, such as ease of use, codominance, and high mutation rates (Sunnucks, 2000). In particular, microsatellite markers have been used extensively to detect genetic diversity and to evaluate population structure in marine organisms, including various gastropods (An et al., 2012; Hyun et al., 2016). MS markers with adequately high levels of polymorphism allow the genetic diversity and structure of various endangered marine species to be inferred easily (An et al., 2013; Lee et al., 2013; Pirog et al., 2016). However, only one molecular genetic analysis has been published for C. retropictum (Park et al., 2016). Therefore, the development of powerful and efficient species-specific markers is necessary to analyze the population genetics of this endangered species. With the advent of next-generation sequencing (NGS) platforms, the development of SSRs in non-model species has grown rapidly (Ekblom and Galindo, 2011). In this study, we developed 21 novel polymorphic microsatellite markers using the Illumina MiSeq NGS platform. The polymorphic microsatellite markers described here will be useful for future genetic studies that will help us to better understand the genetic status of C. retropictum and facilitate the conservation of this endangered species in Korea.
Material and Methods
Sample collection and Illumina sequencing
A total of 49 samples of C. retropictum were collected in August 2016 from three southern coastal sites in Korea: Goseong (n = 20), Geoje (n = 20), and Changwon (n = 9). The collected samples were immediately preserved in 95% ethyl alcohol at the site until DNA extraction. For microsatellite isolation, high-molecular-weight DNA (≥2 μg) was extracted from the muscle tissue of one individual of C. retropictum using the QIAGEN DNeasy Blood & Tissue kit (QIAGEN, Valencia, California, USA). A whole-genome shotgun library was constructed with the TruSeq Nano DNA kit (Illumina, San Diego, CA, USA) following the manufacturer's protocol (Paired-End Library Construction). For genotyping to characterize the microsatellite DNA loci, total genomic DNA was extracted from a small amount of muscle tissue of all collected individuals of C. retropictum using the QIAGEN DNeasy Blood & Tissue kit (see above). The extracted genomic DNA was stored at −20°C until further use.
Microsatellite discovery and primer screening
The resulting raw sequences from C. retropictum were assembled into contigs using the IDBA-UD assembler v1.1.1 (Peng et al., 2010; Peng et al., 2012). To search for microsatellite markers among the assembled contigs (305,591), we used the MISA (Microsatellite identification) software, an SSR motif scanning tool written in Perl (http://pgrc.ipk-gatersleben.de/misa/). Contigs and singletons longer than 300 bp that contained a minimum of five repeats of di-, tri- or tetra-nucleotide motifs were screened using MSATCOMMANDER 1.08 (Faircloth, 2008). Primers were selected using Primer3 software (Untergasser et al., 2012). The key parameters set for primer design were as follows: (1) optimum annealing temperature of 54–60°C with a maximum difference of 1°C between paired primers, (2) PCR product size of 150–500 bp, (3) GC content of 45% to 55%, (4) primer length of 18–26 nucleotides, and (5) primer self-complementarity and quality criteria set to default parameters.
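The repeat search described above can be illustrated with a minimal Python sketch. It is not the MISA or MSATCOMMANDER implementation; it only finds perfect tandem repeats of di-, tri- and tetra-nucleotide motifs with at least five repeat units using a regular expression. The function name `find_ssrs` and the example sequence are hypothetical.

```python
import re

# Minimum repeat count (from the text) and motif lengths scanned (di-, tri-, tetra-).
MIN_REPEATS = 5
MOTIF_LENGTHS = (2, 3, 4)


def find_ssrs(seq, min_repeats=MIN_REPEATS, motif_lengths=MOTIF_LENGTHS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # ([ACGT]{k}) captures a motif; \1{n-1,} requires it to recur in tandem.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical contig fragment: an (ATG)6 array followed by an (AC)5 array.
example = "GGC" + "ATG" * 6 + "TTCA" + "AC" * 5 + "G"
print(find_ssrs(example))
```

Dedicated tools additionally handle compound and imperfect repeats and check that the flanking regions are suitable for primer design, which this sketch does not attempt.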
PCR amplification and genotyping
All of the newly designed PCR primer pairs were tested for consistency in PCR amplification on a sample set of four C. retropictum individuals collected from Goseong, Korea. PCR amplification was performed using a SimpliAmp™ Thermal Cycler (Thermo Fisher Scientific, Waltham, MA, USA) in a 15 μL reaction mixture containing 0.5 U of Ex Taq DNA polymerase (Takara Biomedical, Inc., Shiga, Japan), 1× PCR buffer, 0.2 mM dNTPs, 10 pmol of each primer, and ~30 ng of template DNA. The forward primer of each pair was 5'-end-labeled with 6-FAM, VIC, NED or PET dye (Applied Biosystems Inc., Foster City, CA, USA). PCR reactions were run for 10 min at 95°C, followed by 35 cycles of 20 s at 95°C, 40 s at 59°C, and 1 min at 72°C, with a final extension of 3 min at 72°C. Amplification was considered successful if a visible band was present after running 2 μL of the PCR product on a 1.5% denaturing agarose gel. The Gene Ruler 100 bp plus DNA Ladder molecular weight marker (Thermo Fisher Scientific) was used as a standard to assess the product size. If no amplification was detected, that primer set was excluded from further analysis. For the remaining loci, polymorphism was examined in all collected individuals. To increase the efficiency of genotyping, fluorescently labeled PCR products of four or five markers with different product sizes were pooled in a single-track sequencing reaction. Microsatellite size variations were identified using an ABI PRISM 3130 xL Genetic Analyzer (Applied Biosystems), and alleles were designated by PCR product size relative to a molecular size marker (GENESCAN 500 LIZ, Applied Biosystems). Fluorescent DNA fragments were analyzed using the GENOTYPER software package, version 4.1 (Applied Biosystems).
MICRO-CHECKER version 2.2.3 (Van Oosterhout et al., 2004) was used with 1000 randomizations to detect possible genotyping errors due to null alleles, stuttering, or large allele dropout. The genetic diversity of each locus was evaluated using the number of alleles (NA), the observed and expected heterozygosity (Ho and He, respectively), and the polymorphic information content (PIC) with CERVUS version 3.03 (Kalinowski et al., 2007). Linkage disequilibrium, which measures the extent of distortion from the independent segregation of loci, and deviation from Hardy–Weinberg equilibrium (HWE), assessed using the inbreeding coefficient (FIS) (Weir and Cockerham, 1984), were tested with GENEPOP version 4.2 (Rousset, 2008). Significance levels were adjusted for multiple tests using the sequential Bonferroni correction (Rice, 1989).
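As an illustration of the per-locus diversity statistics named above, the following Python sketch computes NA, Ho, He and PIC from diploid genotypes at one locus. It is a sketch under stated assumptions, not the CERVUS implementation: He uses the sample-size-corrected (unbiased) estimator and PIC uses the standard allele-frequency formula, and whether these match the software's output exactly is an assumption. The function name and the genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_diversity(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, missing data already removed.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: proportion of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity with small-sample correction (assumed estimator).
    sum_p2 = sum(p * p for p in freqs)
    he = (2 * n / (2 * n - 1)) * (1 - sum_p2)

    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    pic = 1 - sum_p2 - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))

    return {"NA": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Hypothetical allele sizes (bp) for four individuals at one locus.
print(locus_diversity([(100, 100), (100, 104), (104, 104), (100, 104)]))
```

Because the PIC formula subtracts a non-negative term from 1 − Σp², PIC can never exceed the uncorrected expected heterozygosity, so a locus with very low He necessarily has a low PIC.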
Results and Discussion
Illumina paired-end sequencing
A total of 3,392,119 reads, consisting of 1,299,622,759 bp (Q30 = 94.07%), were generated from the C. retropictum sample. The raw sequences were assembled into contigs. This process eliminates repetitive sequences and creates longer sequences, which may increase the probability of detecting microsatellite repeats and suitable primers within a sequence (Perry and Rowe, 2011). The high-quality filtered reads were assembled into 305,591 contigs with a total length of 146,095,261 bp (average 478 bp), an N50 scaffold size of 452 bp, and a GC content of 43.53% (Table 1).
|Total number of bases||1,299.62 Mb|
|Average read length||383 nt|
|Number of reads||3,392,119|
|Number of contigs||305,591|
|Total contig length||146,095,261 nt|
|Average contig length||478 nt|
|Max. contig length/Min. contig length||15,969/155 nt|
Table 1: Summary of Illumina MiSeq sequencing.
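The averages in Table 1 follow directly from the totals, and the N50 statistic can be computed from a list of contig lengths. The Python sketch below checks the two averages and shows one common definition of N50 (the length of the contig at which the cumulative length, taken from the longest contig downward, first reaches half of the assembly). The function name `n50` and the toy length list are hypothetical, and the assembler's own N50 definition may differ in detail.

```python
def n50(lengths):
    """Length L such that contigs of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


# Averages recomputed from the totals reported in the text and in Table 1.
avg_read_length = 1_299_622_759 / 3_392_119    # about 383 nt
avg_contig_length = 146_095_261 / 305_591      # about 478 nt
print(round(avg_read_length), round(avg_contig_length))

# Toy example (hypothetical contig lengths, not data from this study).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```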
The average read length obtained here was longer than that reported in a previous study, in which the average read length was 369 bp in Haliotis diversicolor supertexta using 454 GS-FLX pyrosequencing (An et al., 2012). Longer reads increase the likelihood of detecting loci with a greater number of repeats, which are expected to be more polymorphic, as well as the probability of detecting MS repeats and suitable primers within a single read. In addition, a sufficient depth of genome coverage is needed to develop a more comprehensive MS marker set via de novo sequencing (Farrer et al., 2009).
A total of 1,897,156 SSR sequences were identified and classified by repeat type as follows: di-, tri-, tetra-, penta-, and hexa-nucleotide repeats accounted for 39.7% (752,664), 26.1% (495,878), 27.5% (520,944), 4.7% (89,975), and 2.0% (37,695), respectively (Figure 1). A total of 185,972 unique sequences containing pure or compound microsatellite regions and primer-designable flanking regions were selected.
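The repeat-class proportions can be checked against the reported counts with a few lines of Python. This is only an arithmetic check of the figures given in the text; the variable names are arbitrary.

```python
# SSR counts per repeat class, as reported in the text.
ssr_counts = {
    "di": 752_664,
    "tri": 495_878,
    "tetra": 520_944,
    "penta": 89_975,
    "hexa": 37_695,
}

total_ssrs = sum(ssr_counts.values())

# Percentage of each class, rounded to one decimal place as in the text.
ssr_percent = {k: round(100 * v / total_ssrs, 1) for k, v in ssr_counts.items()}

print(total_ssrs)
print(ssr_percent)
```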
Microsatellite loci isolation
Among the 185,972 sequences, 80 with a minimum of nine tri- or tetra-nucleotide repeat motifs were used to develop MS primers. For primer design, we selected sequences of adequate length (more than 300 bp) with unique sequences flanking the MS array (minimum of 100 bases). Thus, 46 MS loci (27 tri- and 19 tetra-nucleotides) were selected for subsequent polymorphism screening. Of these 46 MS loci, 36 (20 tri- and 16 tetra-nucleotides) were amplified successfully in the initial evaluation of the MS primers.
The remaining 10 primer pairs did not generate the desired amplification products in the four individuals tested. Of the 36 amplified loci, 13 showed faint or inconsistent bands, which may be due to nonspecific PCR amplification. Further screening revealed that 21 of the 36 amplified loci (58.3%) were polymorphic and 2 were monomorphic in the four C. retropictum samples. The primer sequences, repeat motifs, fluorescent labels, and GenBank accession numbers for the 21 novel polymorphic microsatellite loci are summarized in Table 2.
|Locus (GenBank accession no.)||Primer sequence (5'-3')||Repeat motif||Size range (bp)||NA||Ho||He||PIC||FIS|
Table 2: Characteristics of the 21 microsatellite loci developed for Clithon retropictum.
For efficient genotyping, the PCR products of four or five loci that differed in product size were pooled in a single-track reaction. Running these PCR products in the same track significantly reduced analysis time and cost compared with the conventional approach (no pooling). The introduction of microsatellite markers to population genetic studies has greatly advanced our ability to determine the genetic structure of populations, test parentage and relatedness, assess genetic diversity, and study recent population history. Compared to traditional cloning-based approaches, NGS is faster and produces a larger, more cost-effective database for microsatellite discovery (Yu et al., 2011).
Of the 21 polymorphic microsatellite loci, 14 were tri-nucleotide repeats and 7 were tetra-nucleotide repeats. For most marine species for which microsatellite markers have been developed, di-nucleotide repeats, such as (AG)n, are still the predominant markers. However, tri- and tetra-nucleotide repeats are reported to be highly polymorphic, more stable, and more clearly scored than di-nucleotide repeats (Lindqvist et al., 1996). This study adds to the evidence for the advantages of generating microsatellites by NGS.
Forty-nine individuals of C. retropictum were screened for variation at the 21 novel polymorphic microsatellite loci. The statistical results for these microsatellite loci are summarized in Table 2. A homology search using BLAST showed that none of these sequences was similar to any of the sequences in GenBank. The 21 newly developed microsatellite markers of C. retropictum were relatively highly polymorphic, except for one locus (CR-08). Understanding the genetic diversity of C. retropictum populations is vital for stock abundance recovery and protection of this endangered species.
In total, 217 alleles were observed across the 21 loci; the number of alleles per locus varied from 2 at CR-08 to 16 at CR-34 and CR-43 (mean 10.33; Table 2). The observed heterozygosity (Ho) ranged from 0.043 at CR-08 to 0.911 at CR-34 (mean 0.661), whereas the expected heterozygosity (He) varied from 0.042 at CR-08 to 0.926 at CR-34 (mean 0.758; Table 2). Polymorphic information content was high (>0.5) at all loci except CR-08, whose PIC cannot exceed its He of 0.042, and rare alleles with a frequency <5% were detected at most loci. Such relatively high levels of genetic diversity have been reported in other gastropod species (An et al., 2012; Hyun et al., 2016), suggesting that these polymorphic microsatellites are sufficient to reveal the intraspecific diversity of this endangered species. There was no evidence of genotyping errors due to stuttering or large allele dropout, which would have affected allele scoring. Samples that failed to amplify after a rerun were excluded, and thus the likelihood that poor DNA quality affected the results was low. The MICRO-CHECKER analysis indicated that seven loci (CR-07, CR-11, CR-20, CR-21, CR-25, CR-41 and CR-43) might be affected by one or more null alleles in the samples tested. Deviation from Hardy–Weinberg equilibrium (HWE) (p < 0.0024) was evident at four loci (CR-11, CR-20, CR-21 and CR-43), all of which showed evidence of null alleles. Generally, heterozygote deficiency can be due to allelic dropout, limited sample size, size homoplasy, or the presence of null alleles (Olivatti et al., 2011).
In this study, the loci that deviated from HWE were all associated with the presence of null alleles. However, this study was limited by the number of samples screened. In addition, the 49 individuals of C. retropictum tested were a mixture of samples from three localities that may be genetically heterogeneous; genetic diversity parameters and Hardy–Weinberg equilibrium would be estimated more reliably from a larger number of individuals from a single locality. Therefore, our results should be interpreted with caution.
In conclusion, the Illumina MiSeq sequencing method was applied to develop 21 polymorphic microsatellite markers for C. retropictum that will enable us to examine the current patterns of genetic structure and diversity of this endangered species in Korea. In the near future, the microsatellites described here will be used in studies on population genetics and conservation genetics and will support actions for effectively managing C. retropictum.
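The significance threshold quoted for the HWE tests is consistent with a Bonferroni-type adjustment for 21 loci, since 0.05/21 ≈ 0.0024. The Python sketch below illustrates the sequential Bonferroni procedure in its step-down form: the smallest p-value is compared with α/k, the next with α/(k − 1), and so on, stopping at the first non-significant test. The function name and the p-values in the example are hypothetical and are not results from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return booleans (in the input order): True where the null hypothesis is rejected."""
    k = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(k), key=lambda i: p_values[i])
    rejected = [False] * k
    for rank, i in enumerate(order):
        # Threshold relaxes stepwise: alpha/k, alpha/(k-1), ..., alpha/1.
        if p_values[i] <= alpha / (k - rank):
            rejected[i] = True
        else:
            break  # all remaining (larger) p-values are also non-significant
    return rejected


# Strictest threshold for 21 simultaneous tests, as used for the 21 loci.
print(round(0.05 / 21, 4))

# Hypothetical p-values for four tests.
print(sequential_bonferroni([0.001, 0.04, 0.03, 0.005]))
```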
Conflicts of interest
The authors declare no conflict of interest.
This work was funded by a grant from the National Marine Biodiversity Institute of Korea (MABIK) (grant numbers 2017M01200, 2017M02000).
About the Authors
Hye Suck An
National Marine Biodiversity Institute of Korea, Janghang-eup, Seochun gun, Chungchungnam-do, 33662, Korea
- An HS, Lee JW, Hong SW (2012). Application of novel polymorphic microsatellite loci identified in the Korean Pacific Abalone (Haliotis diversicolor supertexta (Haliotidae)) in the genetic characterization of wild and released populations. Int. J. Mol Sci. 13: 10750-10764.
- An HS, Lee JW, Hong SW (2013). Population genetic structure of the Korean Pacific abalone Haliotis diversicolor supertexta inferred from microsatellite marker analysis. Biochem. Syst. Ecol. 48: 76-84.
- Ekblom R, Galindo J (2011). Applications of next generation sequencing in molecular ecology of non-model organisms. Heredity. 107: 1-15.
- Faircloth BC (2008). MSATCOMMANDER: detection of microsatellite repeat arrays and automated, locus-specific primer design. Mol. Ecol. Resour. 8: 92-94.
- Farrer RA, Kemen E, Jones JD, Studholme DJ (2009). De novo assembly of the Pseudomonas syringae pv. syringae B728a genome using Illumina/Solexa short sequence reads. FEMS. Microbiol. Lett. 291: 103-111. https://doi.org/10.1111/j.1574-6968.2008.01441.x
- Furujo Y, Tomiyama K (2000). Distribution and microhabitat of coexisting two freshwater snail species, Semisulcospira libertina (Gould, Prosobranchia: Pleuroceridae) and Clithon retropictum (Martens, Prosobranchia: Neritidae). Jpn. J. Malacol. Venus. 59: 245-260.
- Hyun YS, Song HY, Woo J, Oh S, et al. (2016). Isolation and characterization of 20 polynucleotide microsatellite markers in a vulnerable Korean snail, Ellobium chinense, using 454 Pyrosequencing. Genes Genom. 39: 155-160.
- Kalinowski ST, Taper ML, Marshall TC (2007). Revising how the computer program CERVUS accommodates genotyping error increases success in paternity assignment. Mol. Ecol. 16: 1099-1106.
- Lee HJ, Lee DH, Yoon SJ, Kim DH, et al. (2013). Characterization of 20 microsatellite loci by multiplex PCR in swimming crab, Portunus trituberculatus. Genes Genom. 35: 77-85.
- Lindqvist AK, Magnusson PK, Balciuniene J, Wadelius C, et al. (1996). Chromosome-specific panels of tri- and tetranucleotide microsatellite markers for multiplex fluorescent detection and automated genotyping: evaluation of their utility in pathology and forensics. Genome Res. 6: 1170-1176. https://doi.org/10.1101/gr.6.12.1170
- Noseworthy RG, Lee HJ, Choi KS (2013). The occurrence of Clithon retropictum (v. Martens, 1879) (Gastropoda: Neritidae) in an unusual habitat, Northern Jeju Island, Republic of Korea. Ocean Sci. J. 48: 259-262. https://doi.org/10.1007/s12601-013-0023-2
- Ohara T, Tomiyama K (2000). Niche segregation of coexisting two freshwater snail species Semisulcospira libertina (Gould, Prosobranchia: Pleuroceridae) and Clithon retropictum (Martens, Prosobranchia: Neritidae). Jpn J Malacol. 59: 135-147.
- Olivatti AM, Boni TA, Silva-Júnior NJ, Resende LV, et al. (2011). Heterologous amplification and characterization of microsatellite markers in the Neotropical fish Leporinus friderici. Genet Mol Res. 10: 1403-1408. https://doi.org/10.4238/vol10-3gmr1020
- Park SY, Patnaik BB, Kang SW, Hwang HJ, et al. (2016). Transcriptomic analysis of the endangered neritid species Clithon retropictum: de novo assembly, functional annotation, and marker discovery. Genes. 7: e35.
- Peng Y, Leung HC, Yiu SM, Chin FY (2010). IDBA: a practical iterative de Bruijn graph de novo assembler. Berger B (ed): RECOMB, LNBI 6044. 426-440.
- Peng Y, Leung HC, Yiu SM, Chin FY (2012). IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 28: 1420-1428.
- Perry JC, Rowe L (2011). Rapid microsatellite development for water striders by next-generation sequencing. J Hered. 102: 125-129.
- Pirog A, Jaquemet S, Blaison A, Soria M, et al. (2016). Isolation and characterization of eight microsatellite loci from Galeocerdo cuvier (tiger shark) and cross-amplification in Carcharhinus leucas, Carcharhinus brevipinna, Carcharhinus plumbeus and Sphyrna lewini. PeerJ. 4: e2041. https://doi.org/10.7717/peerj.2041
- Rice WR (1989). Analyzing tables of statistical tests. Evolution. 43: 223-225. https://doi.org/10.2307/2409177
- Rousset F (2008). Genepop'007: a complete re-implementation of the Genepop software for Windows and Linux. Mol Ecol Resour. 8: 103-106. https://doi.org/10.1111/j.1471-8286.2007.01931.x
- Shigemiya Y, Kato M (2001). Age distribution, growth, and lifetime copulation frequency of a freshwater snail, Clithon retropictum (Neritidae). Popul Ecol. 43: 133-140.
- Sunnucks P (2000). Efficient genetic markers for population biology. Trends Ecol Evolut. 15: 199-203.
- Untergasser A, Cutcutache I, Koressaar T, et al. (2012). Primer3-new capabilities and interfaces. Nucleic Acids Res. 40: e115.
- Van Oosterhout C, Hutchinson WF, Wills DPM, Shipley P (2004). Micro-Checker: software for identifying and correcting genotyping errors in microsatellite data. Mol Ecol Notes 4: 135. https://doi.org/10.1111/j.1471-8286.2004.00684.x
- Wallmo K, Lew DK (2011). Valuing improvements to threatened and endangered marine species: an application of stated preference choice experiments. J Environ Manage. 92: 1793-1801. https://doi.org/10.1016/j.jenvman.2011.02.012
- Weir BS, Cockerham CC (1984). Estimating F-statistics for the analysis of population structure. Evolution. 38: 1358-1370. https://doi.org/10.1111/j.1558-5646.1984.tb05657.x
- Yu JN, Won C, Jun J, Lim Y, et al. (2011). Fast and cost-effective mining of microsatellite markers using NGS technology: an example of a Korean water deer Hydropotes inermis argyropus. PLoS One. 6: e26933.