Research Article

The genetic diversity and population structure of Genipa americana L. in Northern Mato Grosso, Brazil

Received: January 14, 2018
Accepted: March 15, 2018
Published: March 27, 2018
Genet.Mol.Res. 17(1): gmr16039899
DOI: 10.4238/gmr16039899


Abstract

Genipa americana L. (Rubiaceae) is a species native to the Amazon region that may be subject to inbreeding, since it grows in fragmented environments. The aim of the current study is to assess the genetic diversity and population structure of three G. americana populations growing naturally in Northern Mato Grosso State using SSR markers. DNA was extracted according to the CTAB method, with modifications. Six SSR primers developed for the species were used. Allele frequency, observed and expected heterozygosity, polymorphism information content (PIC), and the fixation index were estimated. Analysis of molecular variance (AMOVA) and principal coordinates analysis (PCoA) were conducted, and the most likely number of groups (K) was inferred using the Structure software in order to help understand the genetic structure of the populations. The three populations showed genetic diversity, shared genetic material and presented a high inbreeding index due to low heterozygosity. The cluster analysis showed that genetic material is shared between populations and that individuals are not genetically structured according to their geographical origins. As G. americana is a culturally and economically important native species, in addition to being attractive to wild animals, it is worth preserving forest fragments where G. americana individuals occur to assure genetic diversity.


Introduction

Genip tree (Genipa americana L.) is a species belonging to the family Rubiaceae; it is native to the Amazon region and is distributed from Northern Argentina to Mexico (UNCTAD, 2005). The species is known for its use for different purposes such as construction, cooking (candy and liquor manufacture) and home-made medicine (Lorenzi, 2002; Souza, 2008). In addition, it presents several chemical compounds that make it important to the pharmacological and food industries (Bentes, 2015; Mendes, 2011).

The intensive use of economically important species such as genip tree, as well as the fragmentation of forest areas, may lead to genetic diversity loss due to the decreased number of individuals within a given population. Such a process boosts genetic variability losses, which lead to genetic drift in the short term, as well as to increased inbreeding rates in the long term, as discussed by Kageyama et al. (1998) and Viegas et al. (2011).

According to Ramalho et al. (2016), the genetic structure of populations refers to the heterogeneous distribution of genotypes and of genetic variability within and between populations. Molecular characterization allows assessing genetic diversity regardless of environmental interferences, as well as inferring the diversity level between individuals and between populations (Costa, 2011).

Molecular markers are a quick and effective instrument in genomic studies, since they directly detect polymorphism at DNA level without environmental influence (Souza, 2001). Thus, polymorphism allows making inferences about the relation between individuals’ genotype and phenotype, a fact that ultimately allows increasing the efficiency of breeding programs and species conservation strategies. These markers have been widely used to estimate the genetic diversity of native Amazon species such as Bertholletia excelsa Bonpl. (Ramalho et al., 2016), Theobroma grandiflorum (Willd. ex Spreng.) K. Schum. (Silva, 2016), Theobroma speciosum (Varella, 2016) and Spondias mombin L. (Silva, 2017). According to Costa et al. (2011), genetic diversity studies are fundamental, mainly when it comes to native species, whose diversity magnitude is yet to be fully known.

Microsatellites, also known as SSR (Simple Sequence Repeat), have been indicated, among different molecular marker classes, for genetic variability studies, allelic frequency calculations, tests of deviation from Hardy-Weinberg equilibrium, genetic and physical genome mapping, genotype identification and discrimination, paternity tests, as well as for population-genetics studies, since they are codominant and are one of the most polymorphic molecular marker classes currently available (Ferreira and Grattapaglia, 1998). Thus, the aim of the present study is to assess the genetic diversity and population structure of three G. americana populations growing naturally in Northern Mato Grosso State using SSR markers.

Material and Methods

Study site and material collection

The collections were carried out in Alta Floresta (AFL), Matupá (MTP), and Nova Bandeirantes (NBD) Counties, Northern Mato Grosso State, Brazil (Figure 1). The climate in the region is hot and humid, the mean annual temperature is above 24°C, the mean annual rainfall is above 2,400 mm, and the region presents a well-defined dry season (3-5 months). The vegetation types in the region are transitional between Ombrophilous Forest, Seasonal Forest and Savanna (IBGE, 2017; Zappi, 2011).


Figure 1: Geographic location of the three G. americana populations studied in Alta Floresta (AFL), Matupá (MTP) and Nova Bandeirantes (NBD) counties, Mato Grosso State, Brazil.

Sixty-four (64) individuals were sampled among the three populations: 20 in AFL, 20 in MTP, and 24 in NBD. All individuals were found at the edges of the forest fragments where the herein studied populations grow; the minimum distance between them was 50 m. Each individual was tagged and mapped through GPS. The collected material was conserved in silica gel during its transportation to the Laboratory of Plant Genetics and Molecular Biology at Alta Floresta University Campus – UNEMAT. It was later stored in a freezer (-20°C) until DNA extraction.

DNA extraction and quantification

The total genomic DNA was extracted from approximately 100 mg of leaf material, according to the CTAB (Cetyl Trimethyl Ammonium Bromide) method described by Doyle and Doyle (1987), with the following modifications to the extraction buffer: polyvinylpyrrolidone (PVP) concentration increased from 1% to 2%; CTAB, from 2% to 5%; and β-mercaptoethanol, from 0.2% to 2%. In addition, the incubation time (at 65°C) was reduced from 60 to 30 minutes. DNA quality and quantity were assessed through electrophoresis in 1% agarose gel stained with ethidium bromide (0.2 mg/mL) and through NanoDrop spectrophotometry. The quantified DNA was diluted to generate the working solutions at 7.5 ng/μL (Manoel et al., 2014).
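
The dilution to the 7.5 ng/μL working concentration follows the usual relation C1·V1 = C2·V2. The sketch below illustrates the arithmetic only; the stock concentration and the final volume are hypothetical values that are not reported in the study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, water volume) in uL from C1*V1 = C2*V2."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical example: a 150 ng/uL stock diluted to 7.5 ng/uL in 100 uL.
stock_ul, water_ul = dilution_volumes(150.0, 7.5, 100.0)
print(f"{stock_ul:.1f} uL DNA + {water_ul:.1f} uL water")
```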

The amplification and genotyping of SSR loci

Seventeen (17) SSR (Simple Sequence Repeat) primers developed for G. americana L. were tested (Manoel et al., 2014) and six of them were selected for analysis (Table 1). The PCR amplification reactions were performed in a Biocycler thermocycler in a final volume of 12 μL, containing: 2 μL DNA (7.5 ng/μL); 2.4 μL 5x buffer (colorless GoTaq); 0.6 μL MgCl2 (50 mM); 2 μL of each primer (2 μM); 2 μL dNTP (1 mM); 0.15 μL Taq polymerase (5 U/μL); and 0.85 μL autoclaved distilled H2O. The amplification program used in the current study was the one proposed by Manoel et al. (2014): one initial denaturation step at 95°C for 5 minutes, followed by 30 cycles of 45 seconds at 95°C, 1 minute at 53-62°C (depending on the primer) and 1 minute at 72°C, and one final extension step at 72°C for 7 minutes. The amplification products were separated through horizontal electrophoresis using 3% agarose gel in 1X TBE buffer at a constant voltage of 80 V. The gels were stained with ethidium bromide (0.6 ng/μL) for 20 minutes after electrophoresis; then, they were visualized in a UV transilluminator and photo documented. The sizes of the amplified fragments were estimated through their comparison with a 100 bp DNA Ladder molecular-weight marker (Kasvi).
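
As a consistency check, the per-reaction volumes listed above can be summed and scaled to a master mix. This is a minimal sketch: the volumes are those given in the text, whereas the forward/reverse labels for the two primers, the 10% pipetting overage and the function name are assumptions made for illustration.

```python
# Per-reaction volumes (uL) as listed in the text; "each primer" is assumed
# to mean one forward and one reverse primer at 2 uL each.
REACTION_UL = {
    "DNA (7.5 ng/uL)": 2.0,
    "5x buffer": 2.4,
    "MgCl2 (50 mM)": 0.6,
    "forward primer (2 uM)": 2.0,
    "reverse primer (2 uM)": 2.0,
    "dNTP (1 mM)": 2.0,
    "Taq polymerase (5 U/uL)": 0.15,
    "H2O": 0.85,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every component except template DNA for n reactions plus overage."""
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if not name.startswith("DNA")
    }


total_ul = round(sum(REACTION_UL.values()), 2)
print("total per reaction (uL):", total_ul)  # the text states 12 uL

# Hypothetical use: one locus amplified in all 64 individuals.
mix = master_mix(64)
for name, volume in mix.items():
    print(f"{name}: {volume} uL")
```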

Loci Class Sequence Expected size

Table 1. Primers used in the genetic diversity analysis of G. americana individuals; respective annealing temperatures and expected size of the amplified fragments (ESAF).

Data analysis

The SSR fragments were analyzed in the GelQuantPro® software (DNR, 2006) in order to build a matrix based on the size of the amplified fragments. The Power Marker V.3.25 software (Liu and Muse, 2005) was used to estimate the allele frequency, the observed and expected heterozygosity, the PIC (Polymorphism Information Content) and the fixation index (F). The matrix was analyzed in the GenAlEx 6.5® software (Peakall and Smouse, 2006) in order to perform the principal coordinates analysis (PCoA) and the analysis of molecular variance (AMOVA). The matrix of genetic distance values by Nei et al. (1983) was generated in the Power Marker V.3.25 software and imported to the MEGA 6.5 software (Kumar et al., 2004) to build the dendrogram, according to the UPGMA method. The Structure software (Pritchard et al., 2000), which is based on Bayesian statistics, was used to infer the number of groups (K). Twenty (20) runs were performed for each K value, each with a burn-in of 200,000 iterations followed by 500,000 Markov Chain Monte Carlo (MCMC) iterations. The criteria by Pritchard and Wen (2004) and the criterion by Evanno et al. (2005) were used to define the most likely K among the tested values, and the results were submitted to the Structure Harvester website.
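
The per-locus statistics named above can be illustrated with textbook estimators: He = 1 − Σp², PIC as defined by Botstein et al. (1980), and F = 1 − Ho/He. The sketch below is not the PowerMarker implementation (which may, for instance, apply sample-size corrections); the function name and the genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for one codominant locus."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = {allele: c / (2 * n) for allele, c in counts.items()}
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs.values())
    # PIC (Botstein et al., 1980): He minus the sum over allele pairs of 2*p^2*q^2.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs.values(), 2))
    f = 1.0 - ho / he if he > 0 else float("nan")
    return {"freqs": freqs, "Ho": ho, "He": he, "PIC": pic, "F": f}


# Hypothetical genotypes (fragment sizes in bp) for a two-allele locus.
demo = [(150, 150)] * 4 + [(160, 160)] * 4 + [(150, 160)] * 2
stats = locus_stats(demo)
print({key: value for key, value in stats.items() if key != "freqs"})
```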


Results

The six microsatellite loci used in the genotyping of 64 G. americana individuals showed 17 alleles in total, ranging from 2 to 4 alleles per locus, with a mean of 2.83 alleles per locus (Table 2). The largest number of alleles was found in locus Gam02 (4 alleles). The polymorphic information content (PIC) ranged from 0.29 to 0.60 (mean 0.45). The observed heterozygosity showed mean value 0.17 and maximum value 0.53 in Gam02. The expected heterozygosity ranged from 0.35 to 0.67 and remained higher than the observed heterozygosity in all herein analyzed loci. This relation between observed and expected heterozygosity resulted in positive inbreeding coefficients (f) for all loci, with overall mean f = 0.40. It indicates that the occurrence of homozygotes in the populations was higher than expected under the Hardy-Weinberg equilibrium model.

Primer Na Dg He Ho PIC
Gam01 3 0.87 0.61 0.27 0.54
Gam02 4 0.92 0.67 0.53 0.60
Gam03 2 0.84 0.35 0 0.29
Gam06 2 0.80 0.48 0 0.36
Gam11 3 0.70 0.57 0.13 0.49
Gam36 3 0.64 0.50 0.10 0.44
Total 17 - - - -
Mean 2.83 0.80 0.53 0.17 0.45

Table 2. Number of alleles (Na), gene diversity (Dg), expected heterozygosity (He), observed heterozygosity (Ho) and Polymorphic Information Content (PIC) of six SSR primers, based on the amplification of 64 G. americana individuals belonging to the three populations sampled in Northern Mato Grosso State, Brazil.

The three herein studied populations showed gene diversity; the MTP population showed the highest observed heterozygosity and the lowest fixation index when it was compared to the other two populations (Table 3). However, all populations showed He higher than Ho.

Populations Dg He Ho F PIC
AFL 0.87 0.48 0.14 0.72 0.40
MTP 0.83 0.51 0.21 0.60 0.42
NBD 0.87 0.57 0.13 0.78 0.50

Table 3. Genetic diversity among the three studied G. americana populations according to the gene diversity index by Nei (H). Gene diversity (Dg), expected heterozygosity (He), observed heterozygosity (Ho), fixation index (F) and Polymorphic Information Content (PIC).
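
The fixation indices in Table 3 can be related to the heterozygosities through F = 1 − Ho/He. The sketch below recomputes F from the rounded He and Ho values of the table; the results agree with the reported F to within about 0.02, and the remaining differences may stem from rounding of the inputs (an assumption, since unrounded values are not given).

```python
# (He, Ho, reported F) per population, copied from Table 3.
TABLE3 = {
    "AFL": (0.48, 0.14, 0.72),
    "MTP": (0.51, 0.21, 0.60),
    "NBD": (0.57, 0.13, 0.78),
}


def fixation_index(he, ho):
    """Fixation index as the proportional deficit of heterozygotes."""
    return 1.0 - ho / he


recomputed = {pop: fixation_index(he, ho) for pop, (he, ho, _) in TABLE3.items()}
for pop, (he, ho, reported) in TABLE3.items():
    print(f"{pop}: F = {recomputed[pop]:.3f} (reported {reported:.2f})")
```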

The analysis of molecular variance (AMOVA) attributed 13% of the total variance to differences between populations and 87% to differences within populations. It showed that the greatest genetic difference lies in the intrapopulation component rather than in the interpopulation one, i.e., the greatest genetic variation resulted from differences between individuals (Table 4).

Variation sources DF* SQ* VC* TV (%)* FST
Between populations 2 26.81 0.273 13 0.13
Within populations 63 223.10 1.785 87  
Total 64 249.91 2.058    

Table 4. Molecular variance analysis (AMOVA) of the three natural G. americana populations found in Northern Mato Grosso State and studied according to six SSR markers.
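
The percentages and the FST in Table 4 follow directly from the variance components. A minimal sketch of that arithmetic, using the two VC values of the table (the function name is illustrative only):

```python
# Variance components copied from Table 4.
VC_BETWEEN = 0.273
VC_WITHIN = 1.785


def amova_partition(vc_between, vc_within):
    """Share of total variance per level; FST is the between-population share."""
    total = vc_between + vc_within
    return {
        "total": total,
        "pct_between": 100.0 * vc_between / total,
        "pct_within": 100.0 * vc_within / total,
        "fst": vc_between / total,
    }


partition = amova_partition(VC_BETWEEN, VC_WITHIN)
print({key: round(value, 3) for key, value in partition.items()})
```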

The population structure analysis based on the Bayesian approach implemented in the Structure software, according to the ΔK method described by Evanno et al. (2005), identified 5 divergent groups, which contributed to the genetic composition of the three G. americana populations analyzed in the current study (AFL, MTP and NBD). ΔK was estimated for K values ranging from 1 to 6; the highest ΔK value was found at K = 5 (Figure 2); the 5 genetic groups were represented in red, green, blue, yellow and pink.
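
The ΔK statistic can be sketched as the absolute second-order difference of the mean log-likelihood across successive K values, divided by the standard deviation of the log-likelihood at K. The code below uses this mean-based form; the LnP(D) values are hypothetical and were chosen only to show the calculation (three runs per K instead of the 20 used in the study).

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """lnp_by_k: {K: [LnP(D) of each replicate run]}; returns {K: deltaK}."""
    ks = sorted(lnp_by_k)
    avg = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    # deltaK needs both neighbours, so it is undefined for the smallest and largest K.
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        if k - prev != 1 or nxt - k != 1:
            continue  # skip K values without consecutive neighbours
        sd = stdev(lnp_by_k[k])
        if sd > 0:
            result[k] = abs(avg[nxt] - 2 * avg[k] + avg[prev]) / sd
    return result


# Hypothetical LnP(D) values for K = 1..6.
runs = {
    1: [-900.0, -901.0, -899.0],
    2: [-860.0, -862.0, -858.0],
    3: [-830.0, -833.0, -827.0],
    4: [-805.0, -808.0, -802.0],
    5: [-700.0, -701.0, -699.0],
    6: [-698.0, -704.0, -692.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k, {k: round(v, 2) for k, v in dk.items()})
```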


Figure 2: Distribution of 64 G. americana individuals in groups, according to molecular data of 6 SSR loci in the Structure software. The individuals are represented by vertical bars; the colors were attributed according to the group formed in the Structure software (five groups, K = 5). 1 (1-20): AFL; 2(21-40): MTP; 3(41-64): NBD.

The original G. americana populations AFL (1), MTP (2) and NBD (3) share individuals belonging to different genetic groups formed in the analysis conducted in the Structure software (Figure 2). This indicates that the genetic groups identified through the Structure software do not follow the geographical origin of the individuals belonging to the original populations. The graphical visualization of the population structure (Figure 2) allowed identifying several individuals carrying a genetic mix of more than two genetic groups.

The analysis of the graph resulting from the PCoA (Figure 3) corroborated the result of the Structure software. Thus, it showed that the three original populations share genetic material among themselves and, consequently, the individuals are not grouped according to their geographical origins.
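
For reference, a principal coordinates analysis starts from a pairwise distance matrix, double-centres the squared distances and extracts the leading eigenvectors. The sketch below is not the GenAlEx implementation: it uses plain power iteration to obtain the first axis only, assumes that the dominant eigenvalue is positive (as for Euclidean distances), and runs on a hypothetical three-point distance matrix.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """First principal coordinate from a symmetric distance matrix (list of lists)."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]
    grand = sum(row) / n
    # Gower double-centring: b_ij = a_ij - rowmean_i - rowmean_j + grand mean.
    b = [[a[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm  # converges to the dominant eigenvalue
    # Coordinates are the unit eigenvector scaled by the square root of the eigenvalue.
    return [x * math.sqrt(eigenvalue) for x in v], eigenvalue


# Hypothetical distances between three individuals lying on a line at 0, 1 and 3.
demo_dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
coords, eigenvalue = pcoa_first_axis(demo_dist)
print([round(c, 3) for c in coords], round(eigenvalue, 3))
```

In this toy case the first axis recovers the original spacing of the three points (up to centring), which is what the dispersion in a PCoA plot represents.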


Figure 3: Graphical dispersion based on the principal coordinates analysis conducted in 64 G. americana individuals from three AFL, MTP and NBD populations sampled in Northern Mato Grosso State.


Discussion

The loci analyzed in the current study showed 17 alleles in 64 individuals (mean 2.83). This number of alleles is lower than that found by Manoel et al. (2014), who analyzed two genip tree populations in Mato Grosso do Sul (19 individuals) and São Paulo (21 individuals) states and found 33 alleles (mean 5.50) in the first population and 27 (mean 4.50) in the second one. The polymorphic information content (PIC) of the SSR markers was 0.45, on average. This value was higher than that found by Silva et al. (2014) in the same species (0.21) using ISSR (Inter-Simple Sequence Repeat) molecular markers. According to Botstein et al. (1980), markers showing PIC values between 0.25 and 0.50 are fairly informative for the species under analysis, as was observed for the SSRs applied to G. americana in the current study.

The observed heterozygosity in the three G. americana populations was lower than the expected heterozygosity. Similar results have been found in several studies conducted with tropical species (Silva et al., 2017; Dardengo et al., 2016; Varella et al., 2016). According to Alves et al. (2007), the number of alleles is correlated with the expected heterozygosity value. The more alleles in the populations, the larger the expected number of heterozygotes among individuals, according to the ratios provided by the Hardy-Weinberg Law (Griffiths et al., 2013). The values found in the current study showed that the herein investigated G. americana populations are not in Hardy-Weinberg equilibrium, as indicated by the mean fixation index.
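
The link between the number of alleles and the expected heterozygosity can be made explicit: under Hardy-Weinberg proportions He = 1 − Σp², which reaches its ceiling 1 − 1/k when the k alleles of a locus are equally frequent. The short sketch below only illustrates this ceiling; the function name is illustrative.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Ceiling of He for loci with 2, 3 and 4 equally frequent alleles.
max_he = {k: expected_heterozygosity([1.0 / k] * k) for k in (2, 3, 4)}
print({k: round(v, 3) for k, v in max_he.items()})
```

The He values in Table 2 stay at or below these ceilings for loci with two alleles (0.35 and 0.48), three alleles (0.50 to 0.61) and four alleles (0.67).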

The mean fixation index of the herein studied populations was 0.40; it was positive and high in genip tree individuals, a fact that indicates inbreeding within populations. However, a high fixation index is not standard in G. americana, according to Sebbenn et al. (1998) (mean -0.07) and Manoel et al. (2014) (means -0.04 and -0.09). According to Kageyama et al. (2003), the fixation index (f) is one of the most important population-genetics parameters used to measure the balance between homozygotes and heterozygotes; an excess of homozygotes over heterozygotes may result from the founder effect.

According to AMOVA, most of the genetic diversity lies within G. americana populations. The grouping generated through the PCoA method, as well as the distribution in 5 groups generated in the Structure software, did not correspond to the individuals’ geographical distribution. It showed that the three original populations share genetic material and that the genetic diversity is not geographically structured in the individuals belonging to these populations. Studies about Amazon species such as Mauritia flexuosa, Theobroma speciosum and Spondias mombin showed that the genetic diversity was geographically structured (Silva et al., 2017; Varella et al., 2016; Rossi et al., 2014), unlike what was observed in the current study.

The G. americana individuals were found in fragmented areas, which may have influenced the genetic diversity patterns and the fixation indices of the studied populations. Disturbed landscapes, besides influencing the demography of plant species, also influence the density of pollinators and seed dispersers. Fragmented populations may show increased inbreeding rates due to the higher probability of self-fertilization, as well as of breeding between related individuals, as a consequence of the reduced number of individuals (Carvalho et al., 2010). According to Gaino et al. (2010), isolated populations tend to reduce their diversity and size, a fact that reinforces the need to preserve plant species.


Conclusions

The genotypes of the herein investigated populations show genetic diversity, which is higher within than between populations. The three populations showed genetic diversity, shared genetic material, and presented a high inbreeding index due to low heterozygosity. Such a response may be related to the fragmented environment where these populations currently live, since it reduces the number of G. americana individuals and pollinators. As it is a culturally and economically important native species, in addition to being attractive to wild animals, it is worth preserving forest fragments where G. americana individuals occur to assure genetic diversity, since most of the diversity lies in the intrapopulation component.

Conflicts of interest

The authors declare no conflict of interest.


Acknowledgments

The authors wish to thank FAPEMAT (Projeto: Conservação e uso de espécies vegetais nativas da região Amazônica com potencial econômico para a região Norte do Estado de Mato Grosso; process #166159/2014). D.A.C. Ruzza received a postgraduate scholarship from FAPEMAT.

About the Authors

Corresponding Author

A.A.B. Rossi

University of the State of Mato Grosso, Alta Floresta Campus, Alta Floresta, MT, Brazil



References

  • Alves RM, Sebbenn AM, Artero AS, Clement C, et al. (2007). High levels of genetic divergence and inbreeding in populations of cupuassu (Theobroma grandiflorum). Tree Genetics & Genomes. 3: 289-298.
  • Bentes AS, Souza H, Amaya-Farfan J, Lopes AS, et al. (2015). Influence of the composition of unripe genipap (Genipa americana L.) fruit on the formation of blue pigment. J Food Sci Tech. 52 (6): 3919-3924.
  • Botstein D, White RL, Skolnick M, Davis RW (1980). Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 32: 314-331.
  • Carvalho ACMD, Freitas MLM, Moraes SMBD, Moraes MLTD, et al. (2010). Diversidade genética, endogamia e fluxo gênico em pequena população fragmentada de Copaifera langsdorffii. Rev Bras Bot. 33: 599-606.
  • Costa TS, Silva AVC, Silva LÉDO A, Santos ARF, et al. (2011). Diversidade genética de acessos do banco de germoplasma de mangaba em Sergipe. Pesqui Agropecu Bras. 46: 499-508.
  • Dardengo JFE, Rossi AAB, Silva BM, Silva IV, et al. (2016). Diversity and spatial genetic structure of a natural population of Theobroma speciosum (Malvaceae) in the Brazilian Amazon. Rev Biol Trop. 64: 1091-99.
  • Doyle JJ and Doyle JL (1987). A rapid DNA isolation procedure for small amounts of fresh leaf tissue. Phytochemical Bulletin. 19: 11-15.
  • Evanno G, Regnaut S, Goudet J (2005). Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 14: 2611-20.
  • Ferreira ME and Grattapaglia D (1998). Introdução ao uso de marcadores moleculares em análise genética. Brasília: EMBRAPA-CENARGEM.
  • Gaino AP, Silva AM, Moraes MA, Alves PF, et al. (2010). Understanding the effects of isolation on seed and pollen flow, spatial genetic structure and effective population size of the dioecious tropical tree species Myracrodruon urundeuva. Conserv Genet. 11: 1631-43.
  • Griffiths AJF, Lewontin RC, Carroll SB, Wessler SR (2013). Introdução à Genética. 10th edn. Guanabara Koogan, Rio de Janeiro.
  • IBGE (Instituto Brasileiro de Geografia e Estatística). Mapas – vegetação.
  • Kageyama PY, Gandara FB, Souza LMI (1998). Conseqüências genéticas da fragmentação sobre populações de espécies arbóreas. Série Técnica IPEF. 12: 65-70.
  • Kageyama PY, Sebbenn AM, Ribas LA, Gandara FB (2003). Diversidade genética em espécies arbóreas tropicais de diferentes estágios sucessionais por marcadores genéticos. Scientia Forestalis. 64: 93-107.
  • Kumar S, Tamura K, Nei M (2004). Mega: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform. 5: 150-163.
  • Liu K and Muse SV (2005). PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics. 21: 2128-29.
  • Lorenzi H and Matos FJA (2002). Plantas Medicinais no Brasil/ Nativas e exóticas. Nova Odessa: Instituto Plantarum.
  • Manoel RO, Freitas MLM, Barreto MA, Moraes MLT (2014). Development and characterization of 32 microsatellite loci in Genipa americana (Rubiaceae). APPS. 2: 1-3.
  • Mendes AA, Oliveira PC, Castro HF, Giordano RLC (2011). Aplicação de quitosana como suporte para a imobilização de enzimas de interesse industrial. Quím Nova. 34: 831-840.
  • Nei M, Tajima F, Tateno Y (1983). Accuracy of estimated phylogenetic trees from molecular data. II. Gene frequency data. J Mol Evol. 19: 153-170.
  • Peakall R and Smouse PE (2006). GenAlEx 6: Genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol. 6: 288-295.
  • Pritchard JK and Wen W (2004). Documentation for structure software: Version 2.1. Available from: accessed December 10, 2016.
  • Pritchard JK, Stephens M, Donnelly P (2000). Inference of population structure using multilocus genotype data. Genetics. 155: 945-959.
  • Ramalho AB, Rossi AAB, Dardengo JFE, Zortéa KÉM, et al. (2016). Diversidade genética entre genótipos de Bertholletia excelsa por meio de marcadores moleculares ISSR. Rev. Floresta. 46: 207-214.
  • Rossi FS, Rossi AAB, Dardengo JFE, Brauwers LR, et al. (2014). Diversidade genética em populações naturais de Mauritia flexuosa L. f. (Arecaceae) com uso de marcadores ISSR. Scientia Forestalis. 42: 631-639.
  • Sebbenn AM, Kageyama PY, Vencovsky R (1998). Variabilidade genética, sistema reprodutivo e estrutura genética espacial em Genipa americana L. através de marcadores isoenzimáticos. Scientia Forestalis. 53: 15-30.
  • Silva AVC, Freire KCS, Lédo AS, Rabbani ARC (2014). Diversity and genetic structure of jenipapo (Genipa americana L.) Brazilian. Sci Agric. 71: 345-355.
  • Silva BM, Rossi AAB, Tiago AV, Schmitt KFM, et al. (2017). Genetic diversity of Cajazeira (Spondias mombin L.) in three geographic regions. Genet Mol Res. 16: 1-11.
  • Silva BMD, Rossi AAB, Dardengo JDFE, Araujo VAACD, et al. (2016). Genetic diversity estimated using inter-simple sequence repeat markers in commercial crops of cupuassu tree. Cienc Rural. 46: 108-113.
  • Souza AP (2001). Biologia molecular aplicada ao melhoramento. In: Recursos genéticos e melhoramento de plantas (Nass LL, Valois ACC, Mello IS, Valadares-Inglis MC. Ed.). Rondonópolis: Fundação. 939-965.
  • Souza VC, and Lorenzi H (2008). Botânica Sistemática - Guia ilustrado para a identificação das famílias de fanerógamas nativas e exóticas no Brasil, baseado em APG II. Nova Odessa, Plantarum.
  • UNCTAD (United Nations Conference on Trade and Development) (2005). Market brief in the European Union for selected natural ingredients derived from native species: Genipa americana. Conference on Trade and Development/BioTrade Facilitation Programme, United Nations.
  • Varella TL, Rossi AAB, Dardengo JFE, Silveira GF, et al. (2016). Effect of fragmentation on the natural genetic diversity of Theobroma speciosum Willd. ex Spreng. populations. Genet Mol Res. 15: 1-10.
  • Viegas MP, Silva CLSP, Moreira JP, Cardin LT, et al. (2011). Diversidade genética e tamanho efetivo de duas populações de Myracrodruon urundeuva Fr. All., sob conservação ex situ. Rev Árvore. 769-779.
  • Zappi DC, Sasaki D, Milliken W, Iva J, et al. (2011). Plantas vasculares da região do Parque Estadual Cristalino, norte de Mato Grosso, Brasil. Acta Amaz. 41: 29-38.
