Molecular Phylogenetics and Evolution 39 (2006) 124–134 www.elsevier.com/locate/ympev

Generating single-copy nuclear gene data for a recent adaptive radiation

Justen B. Whittall a,1, Andrew Medina-Marino b,2, Elizabeth A. Zimmer b, Scott A. Hodges a,*

a Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA
b Department of Botany and Laboratories of Analytical Biology, National Museum of Natural History, Smithsonian Institution Museum Support Center, MRC 534, 4210 Silver Hill Rd, Suitland, MD 20746, USA

Received 9 June 2005; revised 3 October 2005; accepted 6 October 2005
Available online 28 November 2005

Abstract

Recent adaptive radiations provide an exceptional opportunity to understand the processes of speciation and adaptation. However, reconstructing the phylogenetic history of recent and rapidly evolving clades often requires the use of multiple, independent gene genealogies. Nuclear introns are an obvious source of the necessary data but their use is often limited because degenerate primers can amplify paralogous loci. To identify PCR primers for a large number of loci in an especially rapid adaptive radiation, that of the flowering plant genus Aquilegia (Ranunculaceae), we developed an efficient method for amplifying multiple single-copy nuclear loci by sequencing a modest number of clones from a cDNA library and designing PCR primers with one primer anchored in the 3′ untranslated region (3′-UTR) and one primer in the coding region of each gene. Variation between paralogous loci evolves more quickly in 3′-UTRs than in adjacent exons, and therefore we achieved high specificity for isolating orthologous loci. Furthermore, we were able to identify genes containing large introns by amplifying genes from genomic DNA and comparing the PCR product size to that predicted from their cDNA sequence. In Aquilegia eight out of eleven loci were isolated with this method and six of these loci had introns. Among four genes sequenced for samples spanning the phylogenetic breadth of the genus, we found sequence variation at levels similar to that observed in ITS, further supporting the recent and rapid radiation in Aquilegia. We assessed the orthology of amplification products by phylogenetic congruence among loci, the presence of two well established phylogenetic relationships, and similarity among loci for levels of sequence variation. Higher levels of variation among samples for one locus suggest possible paralogy. Overall, this method provides an efficient means of isolating predominantly single-copy loci from both low and high-copy gene families, providing ample nuclear variation for reconstructing species-level phylogenies in non-model taxa.

© 2005 Elsevier Inc. All rights reserved.

Keywords: Nuclear intron; 3′-UTR; Aquilegia; Orthology; Gene genealogy

* Corresponding author. Fax: +1 805 893 4724. E-mail address: [email protected] (S.A. Hodges).
1 Present address: Section of Evolution and Ecology, One Shields Avenue, Davis, CA 95616, USA.
2 Present address: Department of Biology, California Institute of Technology, Pasadena, CA 91125, USA.
1055-7903/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2005.10.010

1. Introduction

The answers to many fundamental questions in evolutionary biology require the accurate reconstruction of species-level phylogenies. To understand the processes of speciation, adaptation, and coevolution, detailed interspecific molecular phylogenies are often necessary (Avise, 1994; Harvey, 1996; Page, 2002; Schluter, 2000). In particular, understanding the process of speciation requires well-resolved phylogenetic reconstructions at and below the species-level to identify the taxa and populations that have recently undergone speciation or are currently diverging (Bradshaw et al., 1995; Schluter, 2000). Unfortunately, as taxa become more closely related (and therefore better models for the study of speciation), their phylogenetic histories become more difficult to reconstruct because of the lack of variation necessary to retrace their evolutionary relationships. Phylogenetic studies of species that have undergone adaptive radiations face this predicament due to their potentially rapid diversification and the resulting high similarity

J.B. Whittall et al. / Molecular Phylogenetics and Evolution 39 (2006) 124–134

among closely related taxa, and yet they are inviting systems in which to identify the evolutionary forces driving speciation (Givnish and Sytsma, 1997; Schluter, 2000).

Some adaptive radiations can occur so quickly that species may not accumulate the neutral DNA variation in single gene regions necessary for phylogeny reconstruction. For instance, it has been estimated that over 1000 species of Lake Malawi cichlid fish have evolved in less than 50,000 years (Kornfield and Smith, 2000). No single gene region, not even the rapidly evolving mtDNA, will accurately capture the phylogeny of all these species (Seehausen et al., 2003). Similarly, in the flowering plant genus Aquilegia (Ranunculaceae) speciation has been so rapid that gene regions commonly used for species-level phylogenetic reconstructions such as the nuclear ribosomal ITS (ITS1, 5.8S, and ITS2) and chloroplast intergenic regions fail to provide a fully resolved phylogeny (Hodges and Arnold, 1994). In these and other instances of rapid species diversifications, sequences from especially large DNA regions or numerous independent gene regions will be required to obtain sufficient numbers of characters for phylogeny reconstruction (Beltran et al., 2002; DeBry and Seshadri, 2001; Sang, 2002). Because lineage sorting may be much more prevalent when speciation is rapid, multiple independent loci will likely provide the most robust data for phylogeny reconstruction of lineages that have undergone rapid speciation (Barker et al., 2005; Beltran et al., 2002; Cronn et al., 2002; Schaal and Olsen, 2000; Shedlock et al., 2004).

Nuclear introns are now being used as additional sources of variation for both animal and plant species-level phylogenies since they harbor levels of sequence diversity similar to those of many plastid and ribosomal spacer regions and are abundant throughout the genome (Helbig et al., 2005; Howarth and Baum, 2005; Sang, 2002; Steiner et al., 2005). Interspecific comparisons of DNA sequences from plant nuclear introns have revealed similar or higher levels of variation than is found in similarly sized non-coding cpDNA regions and/or nuclear ribosomal spacers (reviewed in Sang, 2002). In many cases, nuclear intron based phylogenies have substantially improved resolution compared to studies based on non-coding cpDNA regions and/or nuclear ribosomal spacers (Sang, 2002). In animals, introns were initially considered to be too slowly evolving, susceptible to incomplete lineage sorting, and overly shuffled due to recombination and gene conversion for resolving interspecific phylogenies when compared to mtDNA (Allen and Omland, 2003; DeBry and Seshadri, 2001; Johnson and Clayton, 2000; Palumbi et al., 2001). Yet, several studies have successfully used nuclear introns for resolving species-level phylogenies in a diversity of animal groups (Beltran et al., 2002; Driskell and Christidis, 2004; Kupfermann et al., 1999; Lavoue et al., 2003; Peters et al., 2005; Weibel and Moore, 2002). Although there are many benefits to employing nuclear intron sequence data for phylogeny reconstruction, they remain a relatively unexploited resource because of the difficulties in isolating orthologous loci (Doyle et al., 2003).


A significant limitation of using nuclear introns to resolve interspecific phylogenies is the difficulty in identifying and isolating single-copy loci in non-model systems. It is essential to compare orthologous (i.e., homologous) rather than paralogous genes for phylogeny reconstruction. The potential to confuse the two is a major concern when working with nuclear gene regions (Doyle et al., 2003). Whereas paralogous comparisons document the history of gene duplications, only orthologous comparisons have the potential to reconstruct species relationships (Sang, 2002). Since taxon-specific primers are rarely available for nuclear introns, the most widely used method for amplifying these regions employs degenerate primers designed from conserved amino acid positions across diverse model taxa. Degenerate primers cannot distinguish between orthologues and paralogues since they rely on conserved amino acids, which also allow them to amplify duplicated loci (Palumbi and Baker, 1994; Sang, 2002; Strand et al., 1997). When this occurs, substantial additional effort is required to isolate orthologous loci (Sang, 2002). By designing degenerate primers for genes that have been identified as single-copy in model systems, amplification of paralogous loci can be minimized. However, amplification of paralogous loci may still be problematic if gene duplication has occurred since the divergence between the target taxa and the most closely related model taxon (Doyle et al., 2003). Furthermore, the number of single-copy genes is limiting as many genes belong to gene families (Sappl et al., 2004; Wortman et al., 2003). An alternative to using degenerate primers or targeting single-copy loci is to design specific primers for orthologous nuclear loci regardless of whether they are single copy or members of a gene family.

Because the 3′-untranslated region (3′-UTR) of genes evolves quickly, we hypothesized that primers anchored in this region would be specific for orthologous loci in closely related taxa. The 3′-UTR is a non-coding transcribed region between the stop codon and the transcription termination site found in all mRNAs and therefore cDNAs. The average length of the 3′-UTR in fungi and plants is approximately 200 bp and in animals it is commonly over 400 bp (Mazumder et al., 2003). This length is sufficient to provide numerous potential priming sites and an additional source of sequence variation (Li, 1997). These regions are more variable than the coding regions due to relaxed evolutionary constraints, yet the 3′-UTRs are often less variable than introns since they are involved in regulating transcriptional termination and transcript stability (Mazumder et al., 2003; Williams et al., 1999; Wolfe and dePamphilis, 1997). We predict that the higher variation in the 3′-UTR compared to exons will increase primer specificity and preferentially amplify orthologous loci, even from large gene families. Here, we test our novel 3′-UTR anchored primer method to isolate nuclear gene regions variable enough to resolve the interspecific relationships in the Aquilegia adaptive radiation.

Aquilegia consists of approximately 75 species restricted to the temperate regions of the Northern Hemisphere (Munz, 1946; Whittemore, 1997). An increased diversification rate in the genus has been correlated with the evolution of the nectar spur (Hodges and Arnold, 1995). This recent radiation represents a spectacular diversity of pollination syndromes and habitat types and therefore provides a system that is well-suited for investigating the driving forces of ecological speciation (Rundle and Nosil, 2005; Schluter, 2000). Unfortunately, attempts at resolving the phylogenetic relationships among Aquilegia species with non-coding plastid regions (atpB-rbcL) and nuclear ribosomal spacers (ITS) have revealed surprisingly low sequence variation (Hodges and Arnold, 1994). Currently, there are only two accepted phylogenetic inferences in the genus: (1) the North American Aquilegia species are monophyletic; and (2) Aquilegia is the sister genus to Semiaquilegia (Hodges and Arnold, 1995). Using data from four nuclear introns isolated with the 3′-UTR technique, we examined orthology through phylogenetic congruence with the two accepted relationships mentioned previously and through consistency with the expected levels of sequence variation between the taxa.

2. Materials and methods

2.1. cDNA library

A cDNA library was constructed by Stratagene using inflorescence tissue of Aquilegia formosa (Table 1) and the Uni-Zap XR system (Stratagene). The cDNAs were unidirectionally inserted into the vector allowing clones to be sequenced from the 5′-end providing partial exon sequence and partial to full 3′-UTR sequence. The cDNAs were sequenced on an ABI system at the Laboratories of Analytical Biology, Smithsonian Institution. The cDNA sequences were identified by coding region homology to Genbank accessions with translated blast searches (tblastx) in 2003 (Genbank Release 135) (Altschul et al., 1997). We noted the E-score for the best blast hit for each cDNA. We used E-scores <0.001 as a cut-off for identifying the likely gene underlying each cDNA. At this value and below, E-scores approximate p-values and since we were searching relatively few genes the likelihood of false positives is low. We also noted the percent of amino acid matches (Identities) and the length of these matches. We then used sequences with substantially higher matches for designing PCR primers. To focus our analysis on nuclear genes, we did not use any genes with tblastx similarities to chloroplast or mitochondrial genes.

2.2. Primer design and optimization

Eight primer pairs were designed from selected cDNAs with high homology to known proteins and partial to complete 3′-UTRs. Primers for three additional loci were designed from MADS-box cDNAs isolated and sequenced by Kramer et al. (2003) that included complete 3′-UTR regions (Table 1). For all eleven loci, one primer was anchored in the 3′-UTR and a second primer in the coding region (Fig. 1, Table 2). The size of amplification products (if no introns were present) was predicted from the cDNA sequences and ranged from 332 to 550 bp (Table 3). The amount of 3′-UTR sequence was maximized when possible to explore its potential variation. Primers were optimized to produce single amplification products in Aquilegia species (and in Semiaquilegia adoxoides, when it amplified). Amplifications of Enemion occidentalis and Thalictrum fendleri (Table 1), species from two related genera of the same subfamily Thalictroideae (Hodges and Arnold, 1995; Ro et al., 1997), were included to assess the taxonomic breadth of amplification for each locus. These two taxa were tested at the optimized annealing temperature for Aquilegia species (61 °C) and at an annealing temperature 5 °C lower. All amplifications were carried out in 25 µL

Table 1
Accessions included in this study, their geographic regions (NA = North America; EA = Eurasia) and localities

Taxon                     Region  Locality
Aquilegia formosa         NA      Santa Barbara Botanical Garden, California, USA
A. formosa                NA      Bass Lake, west slope Sierra Nevada Mtns., California, USA
A. chrysantha             NA      Ash Canyon, New Mexico, USA
A. brevistyla             NA      Seebe Canyon, Kananaskis, Alberta, Canada
A. pyrenaica              EA      1400 m, Pyrenees Mtns., France
A. olympica               EA      West Caucasus Mtns., Czech Republic
Semiaquilegia adoxoides   EA      Neixiang Baotianman Nature Reserve, Henan Province, China
Enemion occidentalis      NA      Manzana Creek, Figuroa Mountain, California, USA
Thalictrum fendleri       NA      Wasatch Mountains, Salt Lake City, Utah, USA
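The screening of cDNA clones described in Section 2.1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure actually scripted for this study: the `BlastHit` record, its field names, and the example hits are hypothetical. The thresholds are the ones given in the text (E-score <0.001, Identities >50% over >25 amino acids, organellar matches excluded).

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is illustrative.
E_VALUE_CUTOFF = 1e-3
MIN_PERCENT_IDENTITY = 50.0
MIN_ALIGNED_AA = 25


@dataclass
class BlastHit:
    """Hypothetical summary of the best tblastx hit for one cDNA clone."""
    clone_id: str
    e_value: float
    percent_identity: float
    aligned_aa: int
    organellar: bool  # best hit is a chloroplast or mitochondrial gene


def candidate_nuclear_genes(hits):
    """Keep clones whose best hit passes all thresholds and is not organellar."""
    return [
        h for h in hits
        if h.e_value < E_VALUE_CUTOFF
        and h.percent_identity > MIN_PERCENT_IDENTITY
        and h.aligned_aa > MIN_ALIGNED_AA
        and not h.organellar
    ]


# Made-up hits, for illustration only.
hits = [
    BlastHit("clone_001", 2e-90, 85.0, 120, False),   # passes
    BlastHit("clone_002", 0.5, 30.0, 18, False),      # E-score too high
    BlastHit("clone_003", 1e-120, 95.0, 200, True),   # organellar match
    BlastHit("clone_004", 1e-5, 45.0, 40, False),     # identity too low
]
selected = candidate_nuclear_genes(hits)
print([h.clone_id for h in selected])
```

Keeping the thresholds as named constants makes it easy to see how the number of candidate loci changes when the cut-offs are relaxed or tightened.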

Fig. 1. The 3′-UTR anchored primer technique uses cDNA sequences as templates from which primers are designed in the 3′-UTR (P1) and in the exon (P2). When genomic DNA is amplified with these primers, any difference between the expected length, based on the cDNA sequence, and the observed length of the genomic DNA PCR product is attributed to intron sequence.


Table 2
Primers used in this study

Locus                                             E-value  Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)  cDNA GenBank Accession No.
Glycosyl transferase (3GT)                        6e-35    GAGGAAGCTTTGCCAGAGG        AAATGCGACACTGCGACATA       DQ286959
Glyceraldehyde-3-phosphate dehydrogenase (G3PDH)  2e-71    GTCTGAGGGCAAACTGAAGG       AAACCTGAAGCAGCAATAGGA      DQ224255
Histone H3 (H3)                                   2e-17    CAAACTTCCCTTCCAACGTC       AACTTCCGATATATTTCATTCATTG  DQ286960
Terminal flower 1 (TFL1)                          1e-68    TCAAAGGCAACCGTCACTC        GCATTAAAGTAGGCGCAAGC       DQ286962
Aldehyde dehydrogenase (ALDH)                     1e-101   GCTTTCAACTTTCCCTGTGC       TCATGCAGAAGCAGTCTTCG       DQ286963
Defensin protein (DEFENSIN)                       4e-25    GCAACATGCGTCTAGTTTCAG      GAACCACGAAGGTGACCCT        DQ224256
Acetyl-CoA carboxylase (ACETYL)                   2e-84    ATTCGCGGAGCTACATGATA       CCTACTGCTACTTTCAACAATCAAC  DQ286961
Heat shock protein 70 (HSP70-1)                   2e-90    TCTTCAGGGAGAGAGAGAGTTTG    ATTACTTCCCCACCATCAGG       DQ224254
Apetala-III (AP3-III)                             NA       TGAGTCTGTGAAACTTGTTCGGG    GCAATGCGAATAGCAATGCC       AY162851
Pistillata-AqaPI-1 (PI)                           NA       CGAACTCAGGCACTTGAAGG       GCATTGTTGAATGTTGATACACTCT  AY162852
Agamous-AqaAG-2 (AGAM)                            NA       CATTTGATGGGTGAGGCTCT       CATATGTTTGGAGGGCACAC       AY464110

Eight cDNAs were identified based on tblastx expectation values for matches to known proteins. We then designed a forward primer in the coding region (the first primer listed for each gene) and a reverse primer in the 3′-UTR (second primer listed for each gene). Three additional primer pairs were developed based on Aquilegia MADS-box cDNAs. Primers for AP3-III are directly from Kramer et al. (2003).

Table 3
Identification of primers that amplify introns for eight genes

Locus     Predicted size (bp) a  Actual size (bp)  Intron size (bp)  No. of introns  Aquilegia  Semiaquilegia  Enemion  Thalictrum
HSP70-1   550 (7)                1281              731               2               ++         +              ++       ++
G3PDH     401 (207)              1097              696               2               +
PI        550 (256)              1200              650               NA              +
AP3-III   512 (261)              1093              581               3               +          +
DEFENSIN  332 (121)              683               351               1               +                                  ++
ACETYL    478 (38)               577               99                1               +          +              +        +
H3        358 (147)              358               0                 0               +
3GT       473 (48)               473               0                 0               +

For each gene, the predicted size of the PCR product based on the cDNA sequence is given as well as the actual size of the genomic DNA sequence, except for PI where the actual size was determined by agarose gel electrophoresis. The size of the intron(s) is the difference between the actual and predicted PCR product sizes. The number of introns was determined by comparing the sequence of the cDNA to the sequences of the PCR products for each gene. Whether the PCR primers amplified a single (+) or multiple (++) bands is indicated for each genus surveyed for this study.
a Inferred 3′-UTR lengths indicated in parentheses.
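The intron sizes in Table 3 follow from a simple subtraction, which the short Python sketch below reproduces from the predicted and actual product sizes. The variable and function names are illustrative choices, not part of the original study.

```python
# (locus, size predicted from the cDNA in bp, actual genomic product size in bp),
# copied from Table 3.
TABLE3 = [
    ("HSP70-1", 550, 1281),
    ("G3PDH", 401, 1097),
    ("PI", 550, 1200),
    ("AP3-III", 512, 1093),
    ("DEFENSIN", 332, 683),
    ("ACETYL", 478, 577),
    ("H3", 358, 358),
    ("3GT", 473, 473),
]


def intron_length(predicted_bp, actual_bp):
    """Total intron sequence: genomic product size minus cDNA-predicted size."""
    return actual_bp - predicted_bp


intron_sizes = {locus: intron_length(pred, act) for locus, pred, act in TABLE3}
loci_with_introns = [locus for locus, size in intron_sizes.items() if size > 0]

print(intron_sizes)
print(loci_with_introns)
```

Because only product lengths are compared, this screen can be run from agarose gel estimates before any genomic product is cloned or sequenced, as was done for PI.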

volumes with the following reagents (and their final concentrations): Taq GOLD polymerase buffer (1×, Perkin Elmer), dNTPs (0.25 mM each), MgCl2 (2.5 mM), forward and reverse primers (1 µM each), Taq GOLD polymerase (1 unit, Perkin Elmer), and 20 ng template DNA. Optimized cycling conditions for all loci were 92 °C for 2 min, followed by 35 cycles of 92 °C for 45 s, 61 °C for 30 s, and 72 °C for 1 min 30 s, then a final extension at 72 °C for 10 min.

2.3. Aquilegia survey

To assess orthology among the five loci with the largest introns (Table 3), we tested two benchmark phylogenetic hypotheses: (1) the North American Aquilegia species are monophyletic, and (2) the sister genus to Aquilegia is Semiaquilegia. We have also estimated the relative levels of variation between benchmark clades in terms of the number of variable characters. Three North American Aquilegia species spanning the range of pollination syndromes and habitat types were compared to two geographically distant Eurasian Aquilegia species and S. adoxoides, a member of the sister genus to Aquilegia (Table 1). DNA was extracted from 100 mg of fresh leaf tissue of one individual per species using the DNeasy kit (Qiagen).

2.4. Cloning and sequencing

PCR products were gel purified from 1% agarose gels following the Qiagen DNA purification protocol, then used as templates in the pGEM T-Easy cloning kit (Promega). Three or more positive colonies per sample (with the correct insert size) were identified with PCR amplifications using M13 primers. These amplification products were used directly in simultaneous bi-directional sequencing following the protocol for the Thermosequenase kit (Amersham) with internal primers T7 and pGEM + 46 (CCGCGGGAATTCGAT) containing fluorescent IRD labels 800 and 700 (Li-Cor), respectively. Sequencing was conducted on a Li-Cor 4200 sequencer with 41 and 66 cm gels composed of 5.5 and 3.75% acrylamide, respectively.

2.5. Sequence analysis

Gels were automatically analyzed with eSeq V2.0 (Li-Cor, Biotechnology Division). Contigs were assembled and aligned with AlignIR V2.0 (Li-Cor, Biotechnology Division), then visually inspected for accuracy. Taq error during PCR amplification of template DNA is likely to be random, and therefore unique to individual clones, and its inclusion as true variation would lead to an overestimation of autapomorphies (Beltran et al., 2002). To avoid overestimation of autapomorphies we sequenced a minimum of three clones per species per gene and used the consensus sequence for phylogeny reconstruction unless each variant within an individual was represented by two clones. In such cases we assumed that the individual was a true heterozygote and both sequences were used for phylogeny reconstruction. All variable sites that were ignored were unique to a single clone, both within that individual and among clones from the other species sampled. Thus, if a true heterozygote was ignored then the variation would have been autapomorphic and would not have changed the inferred phylogenetic relationships. This approach also provides a conservative estimate of the number of variable sites. The location of introns was determined by comparison to the cDNA sequence (Fig. 1) and the 3′-UTR was determined as the sequence 3′ of the inferred stop codon for the reading frame with the highest tBlastx hit (Fig. 1).
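The clone-consensus step and the Taq error rates reported in Section 3.3 can be sketched in Python as follows. This is a minimal illustration rather than the analysis as run with the Li-Cor software: the function names and toy clone sequences are hypothetical, the heterozygote rule is not implemented, and the denominator of the error rate (number of clones times the genomic product length in Table 3) is an assumption, although it does reproduce the value reported for HSP70-1.

```python
from collections import Counter


def consensus(clones):
    """Majority-rule consensus of aligned, equal-length clone sequences."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*clones))


def differences_from_consensus(clones):
    """Number of sites at which each clone differs from the consensus."""
    cons = consensus(clones)
    return [sum(a != b for a, b in zip(clone, cons)) for clone in clones]


def error_rate(distribution, length_bp):
    """Errors per bp, where distribution[k] is the number of clones with k differences.

    Assumption: total bases sequenced = number of clones * length_bp.
    """
    n_clones = sum(distribution)
    total_errors = sum(k * n for k, n in enumerate(distribution))
    return total_errors / (n_clones * length_bp)


# Toy example: three clones, one carrying a singleton difference.
clones = ["ACGTACGT", "ACGTACGT", "ACGAACGT"]
print(consensus(clones))
print(differences_from_consensus(clones))

# HSP70-1: 11, 18, 5 and 1 clones with 0, 1, 2 and 3 differences; 1281 bp product.
hsp70_rate = error_rate([11, 18, 5, 1], 1281)
print(f"{hsp70_rate:.1e}")
```

A singleton difference is outvoted in the consensus as long as at least three clones are sequenced, which is why three clones per species per gene is the minimum used here.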
Indels were coded as present/absent regardless of their length and included in subsequent sequence analyses. We aimed to validate each of the phylogenetic benchmark hypotheses for each locus separately with exhaustive searches using unweighted maximum parsimony in PAUP* (Swofford, 1998). For the two loci that did not amplify in S. adoxoides, we were only able to compare one benchmark phylogenetic relationship. Topological support was determined with maximum parsimony bootstrap analysis (1000 replicates). Congruence between the four loci was analyzed with the partition homogeneity test (Farris et al., 1995) in PAUP* with 1000 replicates (constant characters excluded). Additional tests to isolate any sources of incongruence were conducted by removing selected taxa and rerunning the analyses. A combined analysis of all four datasets including indels was conducted using the same methodology as described for the analysis of individual loci.
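For readers unfamiliar with the bootstrap, the resampling step that underlies the support values can be sketched in Python. This shows only how pseudoreplicate alignments are generated by drawing sites with replacement; the parsimony search on each replicate, which PAUP* performs internally, is not shown. The taxon labels and sequences are toy data and the function name is hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: alignment columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}


# Toy alignment, for illustration only.
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGAAC",
    "taxon_C": "ACCTACGAAC",
    "taxon_D": "ACCTTCGTAC",
}

rng = random.Random(1)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
print(len(replicates))
```

Each replicate keeps whole columns intact, so the character states of all taxa at a sampled site stay together; bootstrap support for a clade is then the percentage of replicate trees that contain it.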

The phylogenetic utility of the experimental datasets was first determined by counting the number of parsimony informative sites for all Aquilegia species sampled. The number of parsimony informative sites underestimates the number of phylogenetically useful characters with our limited sampling. Specifically, many variable sites that are currently autapomorphic will be shared with other taxa upon additional sampling. Therefore, we have also compared the number of variable sites for each locus to that of ITS (Hodges and Arnold, 1994). Variable sites were identified as those characters with more than one character state (not including ambiguities). To compare variation based on identical taxon sampling with that for ITS, we removed the Aquilegia pyrenaica sample from the nuclear intron datasets. The number of variable sites among all Aquilegia samples for each locus was also calculated for the three functional subsets of the data: exon, intron, and 3′-UTR regions.

3. Results

3.1. cDNA sequences

To identify gene-specific primers for Aquilegia, we sequenced 116 clones from the cDNA library. Approximately 72 clones (62%) matched database sequences of known proteins (we used Genbank expected value <10^-3 and Identities of >50% for >25 contiguous amino acids as our cutoff for determining a match). Even though cDNA sequences were read in only the forward direction, the majority of cDNAs contained a substantial amount of inferred 3′-UTR sequence based on the position of a stop codon in the reading frame that produced the best match (Fig. 1). In addition, three MADS-box genes isolated from Aquilegia alpina (Kramer et al., 2003) were surveyed for nuclear introns using 3′-UTR anchored primers.

3.2. Amplification of genomic DNAs

We sought to design primers to amplify single-gene PCR products with large introns for species-level phylogenetic reconstruction in Aquilegia. Therefore, we developed primers for eleven 3′-UTR anchored loci and surveyed genomic DNA for introns. Introns were indicated when PCR products from genomic DNA were longer than the expected lengths from the cDNA sequence (Fig. 1). Eight loci were successfully amplified for three accessions of North American Aquilegia and two accessions of Old World Aquilegia (Table 1). Six of these loci produced genomic amplification products that were substantially longer than predicted from the cDNA, indicating the presence of one or more introns (Table 3). In addition, three of these eight loci also amplified in S. adoxoides (HSP70-1, AP3-III, and ACETYL) (Table 3). Furthermore, two primer pairs amplified one or more products in single species of the related genera, Enemion and Thalictrum (HSP70-1 and ACETYL). DEFENSIN amplified multiple products in T. fendleri, but did not amplify in E. occidentalis or S. adoxoides.

3.3. Cloning and sequencing

To identify the necessary variation for species-level phylogenetic reconstruction in Aquilegia, we attempted to clone the five loci with the largest introns (PI, AP3-III, HSP70-1, G3PDH, and DEFENSIN). No positive colonies were isolated from the PI cloning reaction, possibly due to its length (approximately 1200 bp). The following results pertain to the remaining four loci. Three to sixteen clones per species were sequenced for each locus (mean = 5.5). If variation was unique to a single clone, when compared across all species, it was interpreted as Taq error and removed. We quantified Taq error as the number of clones with 0, 1, 2, or 3 nucleotide differences from the consensus sequence of a species. The Taq error distributions, summed across species for each locus, are: 11, 18, 5, and 1 (HSP70-1); 12, 15, 5, and 2 (AP3-III); 9, 14, 3, and 1 (G3PDH); 17, 10, 2, and 0 (DEFENSIN). The inferred rate of error was 6.9 × 10^-4, 8.3 × 10^-4, 7.8 × 10^-4, and 7.1 × 10^-4 errors/bp for HSP70-1, AP3-III, G3PDH and DEFENSIN, respectively. Only one heterozygote was confidently identified based on two unique indels (Aquilegia olympica in AP3-III). The amount of intron sequence varied per locus from 99 to 731 bp (mean = 518 bp) (Table 3). Aquilegia and Semiaquilegia sequences submitted to Genbank have the following accession numbers in the sample order listed in Table 1, beginning with the cDNA sequence and followed by the samples PCR amplified and sequenced: AP3-III (DQ224257, DQ224264–DQ224270), HSP70-1 (DQ224254, DQ224258–DQ224263), G3PDH (DQ224255, DQ217409–DQ217413), and DEFENSIN (DQ224256, DQ224271–DQ224275).

We sought to determine the utility of the 3′-UTR anchored primer technique in isolating single orthologous loci from both low copy loci and loci that are members of large gene families. Although all loci initially appeared as single bands on diagnostic agarose gels, suggesting single copy PCR products, two distinct copies of HSP70-1 were cloned from A. olympica and S. adoxoides. The most frequent clones aligned with the remaining Aquilegia samples and were used in later analyses (Fig. 2, Tables 3 and 4). The rare form of HSP70-1 had a very similar coding region to the common form (only 3.6% variable sites in the exons), yet the introns could not be aligned. Blast searches of both the protein and nucleotide databases return similar results for both the common form and the

Fig. 2. The single most parsimonious trees for each locus, including indels, are labeled AP3-III (A), Hsp70-1 (B), G3PDH (C), and DEFENSIN (D). Branch lengths are indicated above the branches and bootstrap percentages are shown below the branches when greater than 70%. The North American clade (NA) and Eurasian clade (EA) are indicated.


Table 4
Combined nuclear intron and 3′-UTR variation compared to ITS variation within Aquilegia species (A. pyrenaica has been removed so that all loci have identical sampling)

Locus     Variable sites (%)  Variable sites (n)
HSP70-1   7.59                56
G3PDH     2.21                20
AP3-III   1.66                14
DEFENSIN  1.69                8
Total     3.32                98
ITS       2.33                10
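The counts in Table 4 rest on identifying variable sites in an alignment. A minimal Python sketch of how variable and parsimony informative sites can be counted is given below. It applies the definition of a variable site used in this study (more than one character state, ambiguities ignored) and the standard definition of a parsimony informative site (at least two states, each present in at least two sequences). The toy alignment, the set of ambiguity symbols, and the function names are assumptions for illustration, and gap handling is not addressed because indels were coded separately.

```python
from collections import Counter

AMBIGUOUS = set("N?")  # assumed ambiguity symbols, ignored when counting states


def site_states(column):
    """Count the unambiguous character states observed at one alignment site."""
    return Counter(c for c in column if c not in AMBIGUOUS)


def is_variable(column):
    """More than one character state at the site."""
    return len(site_states(column)) > 1


def is_parsimony_informative(column):
    """At least two states that each occur in at least two sequences."""
    return sum(1 for n in site_states(column).values() if n >= 2) >= 2


def count_sites(sequences):
    """Return (variable sites, parsimony informative sites, percent variable)."""
    columns = list(zip(*sequences))
    variable = sum(is_variable(c) for c in columns)
    informative = sum(is_parsimony_informative(c) for c in columns)
    return variable, informative, 100.0 * variable / len(columns)


# Toy alignment of five sequences, for illustration only.
sequences = [
    "ACGTACGT",
    "ACGTACGA",
    "ACCTACGA",
    "ACCTANGT",
    "ACGTTCGT",
]
print(count_sites(sequences))
```

In the toy alignment the fifth site is variable but autapomorphic (one sequence differs), so it counts toward the variable sites without being parsimony informative, which is the distinction drawn in the text.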

rare variant due to similar matches in the nearly identical exon regions. The amount of variation in the 3⬘-UTR between these two duplicated copies cannot be assessed since only 7 bp of 3⬘-UTR were included in the HSP70-1 locus. Reanalysis of the original PCR products indicates that two fragments diVering by less than 50 bp were ampliWed in these two taxa. The longer fragment (and the more frequent clone matching the locus isolated from the remaining species) was preferentially ampliWed at higher annealing temperatures. 3.4. Sequence comparisons Orthology is a prerequisite for accurate phylogenetic reconstruction (Sang, 2002). We used the accepted benchmark relationships and the relative levels of variation across the species to assess orthology. The single most parsimonious trees for each locus analyzed individually were consistent with the benchmark relationship(s) except in branches with very low bootstrap support (e.g., AP3-III Fig. 2). For example, the monophyly of the North American Aquilegia clade is supported by 100, 100, and 99% bootstrap values in HSP70-1, G3PDH and DEFENSIN, respectively (Figs. 2B, C, and D). The second benchmark relationship, the sister genus relationship with S. adoxoides is supported by 38 and 25 non-homoplasious changes in AP3-III and HSP70-1, respectively (Figs. 2A and B). This second benchmark relationship could not be assessed in G3PDH and DEFENSIN since the primers did not amplify in S. adoxoides. Incongruence between gene genealogies is another method to quantify topological consistency among loci, a characteristic of orthology. Incongruence was assessed with the partition homogeneity test and an analysis of homoplasy in the combined dataset. The partition homogeneity test did not identify signiWcant incongruence between the four loci (p D 0.185). The incongruence, although not signiWcant, was eliminated when the A. formosa sample from the AP3-III locus was removed and the partition homogeneity test rerun (p D 1.000). 
An alternative estimate of consistency within and between loci is an analysis of homoplasy. Only three out of 186 variable characters (including S. adoxoides) were homoplasious on the combined tree (RC = 0.964) (Fig. 3). Furthermore, the two benchmark relationships are still maintained with high support in the combined analysis. In addition, the combined analysis indicates a strongly supported relationship for the previously unresolved taxon, Aquilegia chrysantha, within the North American clade (i.e., 99% bootstrap support) (Fig. 3), given the limited sampling.

Fig. 3. The single most parsimonious tree from the combined data of four loci including indels. Branch lengths are indicated above the branches and bootstrap percentages are shown below the branches.

A qualitative comparison of the relative levels of variation across regions (exons, introns, and 3′-UTRs) was conducted to supplement the benchmark relationships as a criterion for orthology. Although each locus has a unique distribution of variation across these gene regions, a general pattern appears in the majority of comparisons: the highest variation is in introns, followed by the 3′-UTRs, and the most conservative regions are the exons, as expected when comparing orthologues (Fig. 4). Specifically, with the exception of G3PDH, all loci have higher intron variation than exon variation (AP3-III, Hsp70-1, and DEFENSIN). The relative conservation of coding sequences is consistent with these genes being expressed, and therefore they are unlikely to be pseudogenes (Lin et al., 2001). In addition, all loci with 3′-UTR sequence differences have more variation in the 3′-UTR than in the exon (AP3-III, G3PDH, and DEFENSIN, although no variation was detected in the 7 bp of the HSP70-1 3′-UTR) (Fig. 4). Of note is the 4-fold difference in the percentage of variable sites in the HSP70-1 intron between the North American and Eurasian Aquilegia clades (Fig. 2B, Fig. 4). Phylogenetic analysis of the Hsp70-1 sequences produces a topology consistent with orthologues, yet the relative levels of variation suggest paralogy. If this were a case of paralogy, it would require reciprocal losses between the North American and Eurasian samples (see Discussion Section 4.3). Within the North American and Eurasian clades, Hsp70-1 maintains the expected levels of variation.

Fig. 4. Comparison of the percent variable sites per region including indels among five Aquilegia species. No variation in the 3′-UTR of Hsp70-1 was found in the 7 bp surveyed.

To determine the usefulness of nuclear introns for species-level phylogenies, we calculated two measures of variation: parsimony informative sites and total variable sites. Parsimony informative sites for the Aquilegia samples were 49 (Hsp70-1), four (AP3), eight (G3PDH), and four (DEFENSIN). The number of parsimony informative sites underestimates the number of potentially phylogenetically informative sites when sampling is incomplete. In this case, we have sampled only five out of approximately 75 Aquilegia species (Munz, 1946; Whittemore, 1997). With such limited sampling, several sites that are currently autapomorphic (and therefore not parsimony informative sites) could be shared with other species with additional sampling. Therefore, we emphasize the comparison of the total percent variable sites among Aquilegia species in the most variable regions (intron and 3′-UTR) with that of the ITS (Table 4). Although most loci are less variable than the ITS (0.12–0.67% less variation than ITS), when the variable sites are summed across the four new nuclear loci, they outnumber the variation in ITS nearly ten to one (Table 4). The HSP70-1 locus contains more than three times the percent variable sites as ITS (see Section 4.3).

4. Discussion

4.1. Successful isolation of nuclear introns

Nuclear introns are a burgeoning source of DNA sequence variation for species-level phylogenetic studies. Here we have demonstrated a new method relying on designing specific primers including one in the 3′-UTR of nuclear genes. This method provides a promising alternative to previous degenerate primer methods for the isolation of single-copy nuclear loci. Because it is relatively easy to generate cDNA libraries, and thus the necessary information for specific primer design (now easily automated), this method should be widely applicable to non-model taxa. Furthermore, half of the loci that we amplified with this method contained introns, and sequencing of four of these genes resulted in nearly 10 times the sequence variation found in ITS alone. The ability to quickly identify intron-containing loci before cloning and sequencing makes this method particularly efficient.
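The two measures of variation used in this study, total variable sites and parsimony informative sites, can be illustrated with a minimal sketch. The function name `site_stats`, the toy alignment, and the decision to ignore gap and missing-data characters are assumptions made for illustration only; the values reported in this paper include indels and were not produced by this code.

```python
from collections import Counter

# Assumption: gap and missing-data symbols are skipped rather than scored.
IGNORED = set("-?N")


def site_stats(alignment):
    """Count variable and parsimony informative sites in an alignment.

    alignment: dict mapping sample name -> aligned sequence (equal lengths).
    Returns (variable_sites, parsimony_informative_sites, alignment_length).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n_sites = lengths.pop()

    variable = informative = 0
    for i in range(n_sites):
        counts = Counter(seq[i].upper() for seq in alignment.values())
        for symbol in IGNORED:
            counts.pop(symbol, None)
        if len(counts) < 2:
            continue  # constant site (or only one scored state)
        variable += 1
        # Informative: at least two states, each present in >= 2 samples.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
    return variable, informative, n_sites


# Hypothetical five-sample toy alignment (not data from this study).
toy = {
    "sp1": "ACGTACGTAC",
    "sp2": "ACGTACGTAC",
    "sp3": "ACGAACGTTC",
    "sp4": "ACGAACGTAC",
    "sp5": "ACGTACCTA-",
}
variable, informative, length = site_stats(toy)
print(variable, informative, 100.0 * variable / length)
```

In the toy alignment, two of the three variable sites are autapomorphic, which mirrors the point above: with few samples, many variable sites are not yet parsimony informative.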

4.2. Substantial nuclear variation

Not only is this method efficient at isolating numerous nuclear intron-containing loci, each locus also contains substantial variation useful for species-level phylogenetic studies. The exact levels of variation differ among loci, consistent with reports of intron variation from other taxa (Beltran et al., 2002; DeBry and Seshadri, 2001; Sang, 2002). A general pattern of the distribution of variation among regions does exist: introns are more variable than 3′-UTRs, and both are more variable than the coding regions. These results support our reasons for anchoring a primer in the 3′-UTR: to gain specificity and to avoid amplifying paralogous loci. Unfortunately, the specificity gained by using the 3′-UTR comes at a cost, since priming site substitutions lead to PCR failure. These null alleles will become increasingly common as the genetic distance increases from the species from which the cDNA was sequenced, potentially restricting this method to closely related taxa.

In our sampling, introns had approximately 1.5× more variable sites (5.47% variable sites) compared to the 3′-UTR (3.69% variable sites). The lower variation in the 3′-UTR relative to the introns may be due to evolutionary constraints limiting variation in the 3′-UTR since it can function as a regulator of transcription (Mazumder et al., 2003; Williams et al., 1999). For example, large 3′-UTR deletions in the chloroplast gene rbcL cause a lack of chlorophyll in some parasitic Orobanche species (Wolfe and dePamphilis, 1997).

Although the levels of nuclear variation are variable and, on average, lower than that of ITS, we generated nearly 10 times the number of ITS variable sites with only four additional nuclear loci. If additional variation is necessary, one can survey the numerous additional cDNA sequences using this 3′-UTR anchored primer technique. In choosing the location of primers to maximize the number of variable sites, extending the forward exon primer upstream would be most productive since it increases the possibility of isolating introns, which contain the highest variation on average. The fixed length and less variable 3′-UTR sequences can be minimized, yet designing primers immediately adjacent to the exon may increase the chances of amplifying paralogous loci due to the conserved motifs regulating post-transcriptional mRNA processing. In addition, for the genes we investigated, there are more indels per bp sequenced in the 3′-UTR (1.2% variable sites) compared to introns (0.8% variable sites). Although this variation provides a source of phylogenetically useful variation, if two alleles with length differences are found in a single individual as a heterozygote, direct sequencing can be complicated (Whittall et al., 2000).

4.3. Confirmed orthologous loci

The identification of paralogous loci is much easier than the confirmation of orthologous loci (Sang, 2002). Here we briefly review background knowledge of copy number for the four loci surveyed, then describe the results of our two criteria for assessing orthology (benchmark phylogenetic relationships and comparative levels of variation).

The four loci investigated can be divided into two classes based on the sizes of their gene families. AP3-III and G3PDH belong to small gene families (Kramer et al., 2003; Olsen and Schaal, 1999). Three copies of AP3 were isolated from the Eurasian A. alpina investigated by Kramer et al. (2003). Alignment of the exons and the 3′-UTRs of these three paralogs from A. alpina indicates that exon sequences are 58% conserved, yet the 3′-UTRs differ dramatically in length and the sequences are unalignable. G3PDH is also a member of a three-gene family, but with very deep divisions between the duplicates (Olsen and Schaal, 1999; Russell and Sachs, 1991).
Furthermore, upstream G3PDH introns have been isolated with degenerate primers and compared both within Aquilegia species and between Aquilegia and the more phylogenetically distant Enemion with no evidence of paralogy (Whittall and Hodges, unpublished). In contrast, heat shock proteins and defensins belong to large gene families (18 copies of Hsp70 and 13 copies of this defensin gene are found in Arabidopsis) (Lin et al., 2001; Thomma et al., 2002). In Drosophila, the diverse Hsp70 gene family maintains 85% amino acid identity through the process of concerted evolution (Bettencourt and Feder, 2002). The homogenized coding regions would seemingly confound a degenerate primer approach, yet the 3′-UTR anchored reverse primer method only isolated one confirmed paralogous copy of HSP70-1, as the minority of clones from just two of six species. This aberrant copy had very similar exon sequence (3.6% divergence) and identical intron locations, yet unalignable intron sequences. Both copies of HSP70-1 blast to the same Hsp70 copy from Arabidopsis (AtHsp70-1) (Lin et al., 2001), suggesting duplication since the divergence of Arabidopsis Hsp70-1 from Aquilegia and Semiaquilegia.
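The 3′-UTR anchored design, and the size comparison used to flag intron-containing loci, can be sketched as follows. This is an illustrative simplification and not the primer-design software used in this study: the function names, the fixed primer length, and the toy sequence are all hypothetical, and practical design would also weigh melting temperature, GC content and secondary structure.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anchored_primer_pair(cdna, cds_end, fwd_start, primer_len=20, utr_offset=0):
    """Pick a forward primer in the coding region and a reverse primer
    anchored in the 3'-UTR of a cDNA given in mRNA sense.

    cds_end: 0-based index of the first 3'-UTR base (just after the stop codon).
    Returns (forward_primer, reverse_primer, product_size_predicted_from_cdna).
    """
    fwd_end = fwd_start + primer_len
    rev_start = cds_end + utr_offset
    rev_end = rev_start + primer_len
    if fwd_start < 0 or fwd_end > cds_end:
        raise ValueError("forward primer must lie within the coding region")
    if rev_end > len(cdna):
        raise ValueError("3'-UTR is too short for the reverse primer")
    forward = cdna[fwd_start:fwd_end]
    reverse = reverse_complement(cdna[rev_start:rev_end])
    return forward, reverse, rev_end - fwd_start


def estimated_intron_length(genomic_product_size, cdna_product_size):
    """Extra length of the genomic PCR product over the cDNA prediction,
    interpreted as total intron length (never negative)."""
    return max(0, genomic_product_size - cdna_product_size)


# Hypothetical toy cDNA: an 18 bp coding region followed by a 12 bp 3'-UTR.
toy_cdna = "ATGAAACCCGGGTTTTAA" + "GCGCATATCCGG"
fwd, rev, size = anchored_primer_pair(toy_cdna, cds_end=18, fwd_start=3, primer_len=6)
print(fwd, rev, size, estimated_intron_length(521, size))
```

A genomic product clearly longer than the size predicted from the cDNA suggests that one or more introns lie between the two primers, which is the screening step described in Section 4.1.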

We used two comparative tests to assess orthology: phylogenetic congruence with benchmark relationships and consistency with expected levels of variation. The benchmark relationships that were possible to directly test were confirmed with high bootstrap support for all loci except in one case with non-significant bootstrap support (AP3-III, Fig. 2A). For two genes, G3PDH and DEFENSIN, the benchmark relationship of monophyly for Aquilegia could not be assessed because the genes did not amplify in S. adoxoides. The failure to amplify these genes does, however, provide some support for the benchmark relationship. As noted above, as phylogenetic distance increases from the taxon used to design primers, it is more likely that variation at the primer sites will cause failed PCR amplification. Thus, the failure of S. adoxoides to amplify for these two genes, in contrast to all species of Aquilegia, supports the monophyly of this genus.

Additional measures of the support for a particular phylogenetic relationship are the number of additional steps necessary to decay a particular branch and the level of homoplasy. Decay values for one benchmark relationship, North American clade monophyly, are one, four, six, and 50 additional steps for AP3-III, DEFENSIN, G3PDH and HSP70-1, respectively. The combined analysis of all genes also strongly supported these benchmark relationships with little evidence of homoplasy for the characters supporting them (only 3 of 186 variable characters). Similar use of benchmark relationships to assess orthology was applied to nuclear intron surveys in Gossypium species (Cronn et al., 2002; Small and Wendel, 2000) and Glycine species (Doyle et al., 2003). In these comparisons, orthology was also confirmed with Southern blotting and physical mapping (Small and Wendel, 2000).
These two additional methods are often cited as the final step in identifying paralogy, yet they are labor intensive and may fail to identify paralogy depending on the specificity of the reaction conditions and in cases of recent duplication events. Alternatively, our comparative method of identifying orthology with benchmark relationships and expected levels of variation should detect most cases of paralogy, except in rare cases of extremely recent duplication events.

There are substantially more variable sites in the apparently orthologous HSP70-1 phylogeny than any other locus investigated (Fig. 4, Table 4). The majority of these variable sites are fixed differences between North American (NA) and Eurasian (EA) Aquilegia species. Specifically, 73% of variable sites in HSP70-1 map non-homoplasiously to the NA-EA divergence compared to 4, 25, and 40% in DEFENSIN, G3PDH and AP3-III, respectively. The disproportionate number of substitutions between the NA and EA Aquilegia HSP70-1 sequences could be due to comparisons of gene duplications much earlier than the NA-EA branch point. If these sequences are paralogs, then the NA and EA lineages would have to have strong preferential amplification of alternate paralogs, as we sampled a total of 20 clones from each lineage and did not identify any violations of the benchmark relationship. Variation at the orthologous primer site in the EA clade could be such that a paralogous locus is actually preferentially amplified. The fact that confirmed paralogs were amplified in A. olympica and Semiaquilegia gives credence to this interpretation.

An alternate explanation to paralogy is that this locus may have undergone an increased substitution rate due to some unique selective forces at or near this locus in the ancestor of either the NA or EA clade. In Drosophila, a lower effective population size caused by selective sweeps has led to increased levels of variation at the Hsp locus (Bettencourt and Feder, 2002). Specifically, the action of concerted evolution and gene conversion in homogenizing the exons could also fix allelic variation in the introns of these loci. Determining the roles of gene duplication (paralogy) and selection (orthology) to explain the exceptional variation in NA-EA HSP70-1 comparisons will require identification of copy number through Southern blotting and/or physical mapping of the genome (Hodges et al., 2002; Sang, 2002). Because of these difficulties, we suggest that loci from known large gene families should be avoided when alternative loci from smaller families exist (Sang, 2002). Nevertheless, remarkably, this method shows that the 3′-UTR anchored primer approach can amplify sequences that maintain benchmark relationships and expected levels of variation consistent with orthology in at least three of the four loci tested.

4.4. Further evidence for a rapid radiation in Aquilegia

The depauperate DNA sequence variation among worldwide Aquilegia species confirms the rapid and recent nature of this adaptive radiation (Hodges and Arnold, 1994). Several nuclear introns will be required to provide the variation necessary to resolve the relationships within Aquilegia. With the addition of four nuclear intron-containing loci described herein, we have increased the number of variable sites nearly 10-fold.
The lower levels of homoplasy in the analysis of nuclear loci (RC = 0.964) compared to the ITS analysis (RC = 0.644) (Hodges and Arnold, 1994) suggest there is consistent phylogenetic signal within and between gene genealogies. Many of the numerous autapomorphies for each Aquilegia species are likely to become parsimony informative sites when the remaining Aquilegia species are added. Furthermore, the nuclear loci also provide strong support for the placement of the previously unresolved A. chrysantha, given the limited sampling.

Resolving the remainder of the Aquilegia phylogeny will require more extensive sampling than typical species-level phylogenies. The potential for polymorphism within and between species will require larger sample sizes. Multiple individuals per population for numerous populations from each species should be included for each gene genealogy. Congruence between multiple gene genealogies will be sought to minimize the effects of lineage sorting. This method should provide the necessary nuclear variation to resolve the Aquilegia phylogeny and many other species-level phylogenies for recently radiating non-model taxa.
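For readers unfamiliar with the rescaled consistency index (RC) used above to summarize homoplasy, the following minimal sketch computes it from per-character step counts using the standard ensemble definitions (CI = M/S, RI = (G − S)/(G − M), RC = CI × RI). The function name and the toy step counts are hypothetical; the RC values reported in this paper were obtained from the phylogenetic analyses themselves, not from this code.

```python
def homoplasy_indices(characters):
    """Ensemble consistency (CI), retention (RI) and rescaled consistency (RC).

    characters: iterable of (min_steps, observed_steps, max_steps) tuples,
    one per character, where
      min_steps      = fewest changes the character could show on any tree,
      observed_steps = changes required on the tree being evaluated,
      max_steps      = most changes the character could require on any tree.
    """
    chars = list(characters)
    m = sum(c[0] for c in chars)  # M: summed minimum steps
    s = sum(c[1] for c in chars)  # S: summed observed steps
    g = sum(c[2] for c in chars)  # G: summed maximum steps
    if s == 0 or g == m:
        raise ValueError("indices are undefined for these step counts")
    ci = m / s
    ri = (g - s) / (g - m)
    return ci, ri, ci * ri


# Hypothetical toy data: four binary characters, one of them homoplasious.
toy_characters = [(1, 1, 2), (1, 1, 3), (1, 2, 2), (1, 1, 2)]
ci, ri, rc = homoplasy_indices(toy_characters)
print(round(ci, 3), round(ri, 3), round(rc, 3))
```

An RC of 1 indicates no homoplasy, so a value closer to 1 for the combined nuclear loci than for ITS corresponds to the lower homoplasy noted above.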


Acknowledgments

The authors thank Dr. Douglas Bush, Brian Counterman, Tania Bricker-Shanahan, and Ji Yang for assistance in the lab. Susan Mazer, Ji Yang, and two anonymous reviewers provided helpful suggestions on earlier drafts of the manuscript. We gratefully acknowledge grant support from the NSF (DEB-0129130 and EF-0412727) and a UCSB Faculty Research Grant to S.A.H. J.B.W. was supported on a teaching assistantship from the Department of Ecology, Evolution and Marine Biology (UCSB) and the Olivia Long Converse Graduate Fellowship (UCSB). A.M.M. was supported by the Smithsonian’s Minority Internship Program. This research represents a portion of J.B.W.’s Ph.D. dissertation.

References

Allen, E., Omland, K., 2003. Novel intron phylogeny supports plumage convergence in orioles (Icterus). The Auk 120, 961–969.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research 25, 3389–3402.
Avise, J.C., 1994. Molecular Markers, Natural History and Evolution. Chapman Hall, New York.
Barker, N., Senger, I., Howis, S., Zachariades, C., Ripley, B., 2005. Plant phylogeography based on rDNA ITS sequence data: two examples from the Asteraceae. In: Bakker, F.T., Chatrou, L.W., Gravendeel, B., Pelser, P.B. (Eds.), Plant Species-Level Systematics: New Perspectives on Pattern and Process. A.R.G. Gantner Verlag, Ruggell, pp. 217–244.
Beltran, M., Jiggins, C.D., Bull, V., Linares, M., Mallet, J., McMillan, W.O., Bermingham, E., 2002. Phylogenetic discordance at the species boundary: Comparative gene genealogies among rapidly radiating Heliconius butterflies. Molecular Biology and Evolution 19, 2176–2190.
Bettencourt, B.R., Feder, M.E., 2002. Rapid concerted evolution via gene conversion at the Drosophila hsp70 genes. Journal of Molecular Evolution 54, 569–586.
Bradshaw, H.D., Wilbert, S.M., Otto, K.G., Schemske, D.W., 1995. Genetic mapping of floral traits associated with reproductive isolation in monkeyflowers (Mimulus). Nature 376, 762–765.
Cronn, R.C., Small, R.L., Haselkorn, T., Wendel, J.F., 2002. Rapid diversification of the cotton genus (Gossypium: Malvaceae) revealed by analysis of sixteen nuclear and chloroplast genes. American Journal of Botany 89, 707–725.
DeBry, R.W., Seshadri, S., 2001. Nuclear intron sequences for phylogenetics of closely related mammals: An example using the phylogeny of Mus. Journal of Mammalogy 82, 280–288.
Doyle, J.J., Doyle, J.L., Harbison, C., 2003. Chloroplast-expressed glutamine synthetase in Glycine and related Leguminosae: Phylogeny, gene duplication, and ancient polyploidy. Systematic Botany 28, 567–577.
Driskell, A., Christidis, L., 2004. Phylogeny and evolution of the Australo-Papuan honeyeaters (Passeriformes, Meliphagidae). Molecular Phylogenetics and Evolution 31, 943–960.
Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1995. Testing significance of incongruence. Cladistics 10, 315–319.
Givnish, T.J., Sytsma, K.J., 1997. Molecular Evolution and Adaptive Radiation. Cambridge University Press, New York.
Harvey, P.H., 1996. New Uses for New Phylogenies. Oxford University Press, New York.
Helbig, A., Kocum, A., Seibold, I., Braun, M., 2005. A multi-gene phylogeny of aquiline eagles (Aves: Accipitriformes) reveals extensive paraphyly at the genus level. Molecular Phylogenetics and Evolution 35, 147–164.


Hodges, S.A., Arnold, M.L., 1994. Columbines: a geographically widespread species flock. Proceedings of the National Academy of Sciences USA 91, 5129–5132.
Hodges, S.A., Arnold, M.L., 1995. Spurring plant diversification: are floral nectar spurs a key innovation? Proceedings of the Royal Society of London B 262, 343–348.
Hodges, S.A., Whittall, J.B., Fulton, M., Yang, J.Y., 2002. Genetics of floral traits influencing reproductive isolation between Aquilegia formosa and Aquilegia pubescens. American Naturalist 159, S51–S60.
Howarth, D.G., Baum, D.A., 2005. Genealogical evidence of homoploid hybrid speciation in an adaptive radiation of Scaevola (Goodeniaceae). Evolution 59, 948–961.
Johnson, K., Clayton, D., 2000. Nuclear and mitochondrial genes contain similar phylogenetic signal for pigeons and doves (Aves: Columbiformes). Molecular Phylogenetics and Evolution 14, 141–151.
Kornfield, I., Smith, P.F., 2000. African cichlid fishes: Model systems for evolutionary biology. Annual Review of Ecology and Systematics 31, 163–196.
Kramer, E.M., Stilio, V.S.D., Schluter, P.M., 2003. Complex patterns of gene duplication in the apetala3 and pistillata lineages of the Ranunculaceae. International Journal of Plant Sciences 164, 1–11.
Kupfermann, H., Satta, Y., Takahata, N., Tichy, H., Klein, J., 1999. Evolution of Mhc-DRB Introns: Implications for the origin of primates. Journal of Molecular Evolution 48, 663–674.
Lavoue, S., Sullivan, J.P., Hopkins, C.D., 2003. Phylogenetic utility of the first two introns of the S7 ribosomal protein gene in African electric fishes (Mormyroidea: Teleostei) and congruence with other molecular markers. Biological Journal of the Linnean Society 78, 273–292.
Li, W.-H., 1997. Molecular Evolution. Sinauer Associates, Sunderland.
Lin, B.L., Wang, J.S., Liu, H.C., Chen, R.W., Meyer, Y., Barakat, A., Delseny, M., 2001. Genomic analysis of the Hsp70 superfamily in Arabidopsis thaliana. Cell Stress and Chaperones 6, 201–208.
Mazumder, B., Seshadri, V., Fox, P.L., 2003. Translational control by the 3′-UTR: The ends specify the means. Trends in Biochemical Sciences 28, 91–98.
Munz, P.A., 1946. Aquilegia: The cultivated and wild columbines. Gentes Herbarium 7, 1–150.
Olsen, K.M., Schaal, B.A., 1999. Evidence on the origin of cassava: phylogeography of Manihot esculenta. Proceedings of the National Academy of Sciences of the United States of America 96, 5586–5591.
Page, R.D. (Ed.), 2002. Tangled Trees: Phylogeny, Cospeciation, and Coevolution. University of Chicago Press, Chicago.
Palumbi, S.R., Baker, C.S., 1994. Contrasting population structure from nuclear intron sequences and mtDNA of humpback whales. Molecular Biology and Evolution 11, 426–435.
Palumbi, S.R., Cipriano, F., Hare, M.P., 2001. Predicting nuclear gene coalescence from mitochondrial data: The three-times rule. Evolution 55, 859–868.
Peters, J., McCracken, K.G., Zhuravlev, Y., Lu, Y., Wilson, R., Johnson, K., Omland, K., 2005. Phylogenetics of wigeons and allies (Anatidae: Anas): the importance of sampling multiple loci and multiple individuals. Molecular Phylogenetics and Evolution 35, 209–224.
Ro, K.E., Keener, C.S., McPheron, B.A., 1997. Molecular phylogenetic study of the Ranunculaceae: Utility of the nuclear 26S ribosomal DNA in inferring intrafamilial relationships. Molecular Phylogenetics and Evolution 8, 117–127.

Rundle, H.D., Nosil, P., 2005. Ecological speciation. Ecology Letters 8, 336–352.
Russell, D.A., Sachs, M.M., 1991. The maize cytosolic glyceraldehyde-3-phosphate dehydrogenase gene family: organ-specific expression and genetic analysis. Molecular and General Genetics 229, 219–228.
Sang, T., 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Critical Reviews in Biochemistry and Molecular Biology 37, 121–147.
Sappl, P., Heazlewood, J., Millar, A., 2004. Untangling multi-gene families in plants by integrating proteomics into functional genomics. Phytochemistry 65, 1517–1530.
Schaal, B.A., Olsen, K.M., 2000. Gene genealogies and population variation in plants. Proceedings of the National Academy of Sciences of the United States of America 97, 7024–7029.
Schluter, D., 2000. The Ecology of Adaptive Radiation. Oxford University Press, New York.
Seehausen, O., Koetsier, E., Schneider, M.V., Chapman, L.J., Chapman, C.A., Knight, M.E., Turner, G.F., Van Alphen, J.J.M., Bills, R., 2003. Nuclear markers reveal unexpected genetic variation and a Congolese-Nilotic origin of the Lake Victoria cichlid species flock. Proceedings of the Royal Society of London Series B: Biological Sciences 270, 129–137.
Shedlock, A., Takahashi, K., Okada, N., 2004. SINEs of speciation: tracking lineages with retroposons. Trends in Ecology and Evolution 19, 545–553.
Small, R.L., Wendel, J.F., 2000. Phylogeny, duplication, and intraspecific variation of Adh gene family in diploid and tetraploid cotton (Gossypium). Genetics 155, 1913–1926.
Steiner, C., Tilak, M., Douzery, E., Catzeflis, F., 2005. New DNA data from a transthyretin nuclear intron suggest an Oligocene to Miocene diversification of living South American opossums (Marsupialia: Didelphidae). Molecular Phylogenetics and Evolution 35, 363–379.
Strand, A.E., Leebens-Mack, J., Milligan, B.G., 1997. Nuclear DNA-based markers for plant evolutionary biology. Molecular Ecology 6, 113–118.
Swofford, D.L., 1998. PAUP*: phylogenetic analysis using parsimony (* and other methods). Sinauer Associates, Sunderland.
Thomma, B.P.H.J., Cammue, B.P.A., Thevissen, K., 2002. Plant defensins. Planta 216, 193–202.
Weibel, A.C., Moore, W.S., 2002. A test of a mitochondrial gene-based phylogeny of woodpeckers (Genus Picoides) using an independent nuclear gene, B-Fibrinogen intron 7. Molecular Phylogenetics and Evolution 22, 247–257.
Whittall, J.B., Liston, A., Gisler, S., Meinke, R.J., 2000. Detecting nucleotide additivity from direct sequences is a SNAP: An example from Sidalcea (Malvaceae). Plant Biology 2, 211–217.
Whittemore, A.T., 1997. Aquilegia. In: Whittemore, A.T. (Ed.), Flora of North America. Oxford University Press, New York, pp. 249–258.
Williams, G.D., Chang, R.Y., Brian, D.A., 1999. A phylogenetically conserved hairpin-type 3′ untranslated region pseudoknot functions in coronavirus RNA replication. Journal of Virology 73, 8349–8355.
Wolfe, A.D., dePamphilis, C.W., 1997. Alternate paths of evolution for the photosynthetic gene rbcL in four nonphotosynthetic species of Orobanche. Plant Molecular Biology 33, 965–977.
Wortman, J., Haas, B., Hannick, L., Smith, R., Maiti, R., Ronning, C., Chan, A., Yu, C., Ayele, M., Whitelaw, C., White, O., Town, C., 2003. Annotation of the Arabidopsis genome. Plant Physiology 132, 461–468.
