Insect Molecular Biology (1996) 5(3), 153-165

The insect cytochrome oxidase I gene: evolutionary patterns and conserved primers for phylogenetic studies

D. H. Lunt, D.-X. Zhang, J. M. Szymura* and G. M. Hewitt

Population Biology Sector, School of Biological Sciences, University of East Anglia, Norwich

Abstract

Insect mitochondrial cytochrome oxidase I (COI) genes are used as a model to examine the within-gene heterogeneity of evolutionary rate and its implications for evolutionary analyses. The complete sequence (1537 bp) of the meadow grasshopper (Chorthippus parallelus) COI gene has been determined, and compared with eight other insect COI genes at both the DNA and amino acid sequence levels. This reveals that different regions evolve at different rates, and the patterns of sequence variability seem associated with functional constraints on the protein. The COOH-terminal was found to be significantly more variable than internal loops (I), external loops (E), transmembrane helices (M) or the NH2 terminal. The central region of COI (M5-M8) has lower levels of sequence variability, which is related to several important functional domains in this region. Highly conserved primers which amplify regions of different variabilities have been designed to cover the entire insect COI gene. These primers have been shown to amplify COI in a wide range of species, representing all the major insect groups; some even in an arachnid. Implications of the observed evolutionary pattern for phylogenetic analysis are discussed, with particular regard to the choice of regions of suitable variability for specific phylogenetic projects.

Keywords: insect, Chorthippus parallelus, cytochrome oxidase I, mitochondrial DNA, conserved PCR primers, genetic marker.

Introduction

The study of mitochondrial DNA (mtDNA) sequences has become the method of choice in recent years for a wide range of taxonomic, population and evolutionary investigations in animals. Many aspects of the structure and evolution of mtDNA have made it a valuable evolutionary tool. These include its ease of isolation, high copy number, lack of recombination, conservation of sequence and structure across metazoa, and range of mutational rates in different regions of the molecule (reviewed by Moritz et al., 1987; Harrison, 1989; Simon, 1991; Wolstenholme, 1992a). The mitochondrial gene encoding subunit I of cytochrome oxidase (COI) possesses some extra characteristics which make it particularly suitable as a molecular marker for evolutionary studies. Firstly, COI, as the terminal catalyst in the mitochondrial respiratory chain, has been relatively well studied at the biochemical level, and its size and structure appear to be conserved across all aerobic organisms investigated (Saraste, 1990). Mutational studies have been used to map the reaction centres of this subunit (Gennis, 1992) and these provide a background which enables interpretation of sequence differences in terms of gene function. Cytochrome oxidase I is involved in both electron transport and the associated translocation of protons across the membrane, and it has been shown to contain a range of different types of functional domain including ligand sites, components of the proton channel, structural α-helices and interspersing hydrophilic loops (Saraste, 1990; Gennis, 1992). Amino acid residues in the reaction centres, which are highly conserved, do not dominate the entire COI molecule, allowing scope for considerable variability in some regions. Such a mix of highly conserved and variable regions so closely associated in a mitochondrial gene makes the COI gene particularly useful for evolutionary studies.
Secondly, the COI gene is the largest of the three mitochondria-encoded cytochrome oxidase subunits (composed of 511 amino acids in D. yakuba, compared

*Present address: Department of Comparative Anatomy, Jagellonian University, Karasia 6, 30460 Krakow, Poland.
Received 14 July 1995; accepted 3 November 1995.
Correspondence: Professor G. M. Hewitt, Population Biology Sector, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ.

© 1996 Blackwell Science Ltd


to 228 for COII and 261 for COIII; Clary & Wolstenholme, 1985), and is one of the largest protein-coding genes in the metazoan mitochondrial genome. This enables one to amplify and sequence many more characters (nucleotides), within the same functional complex, than is possible for almost any other mitochondrial gene.

A suitable genetic marker is an essential prerequisite for success in many evolutionary studies. The crucial characteristic in the choice of such a marker is the substitutional rate of the particular region. To a large degree it is the broad spectrum of substitutional rates which accounts for the popularity of animal mtDNA as a molecular tool, since it allows resolution of both intraspecific phylogenies (e.g. Avise et al., 1987) and the higher level systematics of anciently diverged taxa (e.g. Ballard et al., 1992). It is well known that different genes may evolve at different rates, and the same gene may have different rates of evolution in different lineages. However, within-gene heterogeneity of evolutionary rate has not yet received enough attention, especially in the field of lower taxonomic level phylogenetic studies. It may be misleading for many applications to consider a gene as fast or slowly evolving, because this implies a homogeneity of rate across the whole gene, which is rarely true due to the concentration of functional constraints in specific regions of the DNA sequence. Hence it is highly advantageous to have information concerning the relative substitutional rates of different gene regions, as this will allow a much more informed choice of sequence for particular phylogenetic investigations. Sequences evolving too quickly are known to lose their ability to unambiguously reveal the phylogeny of anciently diverged taxa. Similarly, choice of a sequence which is too conserved when addressing questions of intraspecific phylogeography, for example, will not provide enough informative characters to determine the requisite relationships.
Thus for many studies success will depend to a large degree on the sampling of a region containing a suitable level of variability. In this study we have used the COI gene as a model to study the within-gene heterogeneity of evolutionary rate and discuss this in terms of its implications for phylogenetic studies. The complete sequence of cytochrome oxidase I for the meadow grasshopper (Chorthippus parallelus) has been determined. (At the time of writing, no other Orthopteran COI sequence has previously been published, though the Locusta migratoria sequence, kindly provided by Paul Flook, University of Berne, Switzerland, prior to publication, is also included here; see Flook et al., 1996.) Comparative analysis of this sequence with seven other

published insect COI sequences has been carried out in order to identify and quantify areas of differing levels of variability. The relative rates of evolution (at the amino acid level) of these areas are considered in the context of both the structure-function model of COI and their utility in evolutionary studies. Areas of DNA sequence which are completely conserved are also shown, and conserved primers designed and tested for their applicability to a wide range of insect species.
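As an illustration of how completely conserved stretches of an alignment might be located when looking for candidate primer sites, the following minimal Python sketch scans aligned sequences for runs of identical columns. The function name `conserved_runs`, the `min_len` threshold and the toy sequences are hypothetical and are not taken from this study; practical primer design would also have to weigh melting temperature, 3' end stability and degeneracy.

```python
def conserved_runs(aligned, min_len=18):
    """Return half-open (start, end) column ranges in which every sequence
    carries the same base. Gap columns ('-') never count as conserved.
    Assumption: sequences are pre-aligned and of equal length."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    runs = []
    start = None  # start column of the run currently being extended
    for col in range(length):
        column = {seq[col].upper() for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical and start is None:
            start = col
        elif not identical and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    # close a run that reaches the end of the alignment
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Toy alignment (hypothetical sequences, not the Fig. 1 data).
toy = [
    "ATGGCTTTACCA",
    "ATGGCATTACCA",
    "ATGGCTTTACCA",
]
print(conserved_runs(toy, min_len=5))
```

The default threshold of 18 columns is only a placeholder for a typical primer length; it would need to be chosen to suit the primers being designed.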

Results and Discussion

Sequence and structure of the C. parallelus COI gene

The COI gene of Chorthippus parallelus has been completely sequenced and is presented in Fig. 1 together with an alignment of eight other insect species. This sequence significantly extends our knowledge of insect COI genes, because six of the eight sequences previously available were from the same Order, Diptera. The Chorthippus parallelus COI gene is flanked by tRNA-Tyr at the 5’ end and tRNA-Leu at the 3’ end. Although the relocation of tRNA genes is not an uncommon event (Hauke & Gellisson, 1988; Paabo et al., 1991; Smith et al., 1993), Chorthippus shares its positioning of these genes with many of the other insects presented here (Beard et al., 1993).

The complete C. parallelus COI nucleotide sequence was found to be 69.4% A+T, comprised of 34.1% A, 15.5% C, 15.1% G and 35.3% T. The A+T percentage of nucleotides at the third codon position is much higher (90.8% AT) than at either of the other two positions (first = 57.9%; second = 60.6%). With the exception of Apis, these values are typical of those reported for other insects (see Table 1).

The putative initiation codon for C. parallelus COI is TCG and the stop codon is a single T at position 1537. Although the exact initiation and termination codons for insect COI genes are probably less clear than those for any other insect mitochondrial gene (Beard et al., 1993), the codons suggested here are wholly consistent with those reported elsewhere (see Table 1). Beard et al. (1993) discuss the possibility that the initiation codon for D. yakuba is the TCG triplet immediately following the ATAA which is more usually recognized. Although TCG would fit well with many of the other insect sequences presented here, it is not supported by the other Drosophila sequences. Both D. simulans and D. sechellia share the ATAA motif but have undergone substitutions which alter the following TCG (Ser) triplet to CCG (Pro). The 5’ COI sequences of two other Drosophila species have been reported by Satta et al. (1987).
These species commence with the tetranucleotides ATAA (D. melanogaster) and GTAA (D. mauritiana), indicating that differences in initiation codon can be present even between sibling species. The termination codons used for insect COI genes are shown in Table 1. Many organisms terminate with a single T, or TA, immediately adjacent to the tRNA gene, and it is known that complete (TAA) termination codons can be produced by post-transcriptional polyadenylation (Ojala et al., 1981). Drosophila and Phormia, however, show the complete TAA termination codon common in many other mitochondrial genes.

Figure 1. DNA alignment of the COI gene for nine species of insect. Identity to C. parallelus is indicated by a period, a deletion by a dash. (Putative) termination and initiation codons are displayed in lower case. The twenty-five structural regions are indicated (see text and Fig. 3 for details).

Translation of this sequence with the invertebrate mitochondrial genetic code (Clary & Wolstenholme, 1985) gives a protein sequence with a mixture of residues conserved across all the studied species and residue positions of differing levels of variability [see below and Fig. 2. Note that the first amino acid residues for all COI proteins discussed here are defined as methionines regardless of the initiation codons, as suggested by Wolstenholme (1992b)]. No insertion/deletion events are apparent between C. parallelus and eight of the nine other insects. Apis, however, shows a deletion of 3 bp at position 1464 (Fig. 1). The position indicated here differs by three codons from that given by Crozier & Crozier (1993) but seems to give a better overall alignment. This deletion falls within the COOH-terminal region and may not be constrained with respect to either size, structure or amino acid function to the same extent as would a deletion of non-terminal residues. These observations agree with the expectation of conservation of size and structure in functionally constrained systems. Figure 3


Table 1. Summary of data concerning the COI gene for nine species of insect. The sequences for D. simulans and D. sechellia terminate prematurely.

Ref  Organism               Order        Init codon  Term codon  Length (bp)  % AT  Accession number
1    C. parallelus          Orthoptera   TCG         T           1537         69.4  -
2    Locusta migratoria     Orthoptera   ATTA        T           1542         69.1  X80245
3    Anopheles gambiae      Diptera      TCG         T           1537         68.6  L20934
4    A. quadrimaculatus     Diptera      TCG         T           1537         68.1  L04272
5    Drosophila yakuba      Diptera      ATAA        TAA         1540         69.9  X03240
6    D. sechellia           Diptera      ATAA        -           (1498)       70.2  M57908
7    D. simulans            Diptera      ATAA        -           (1498)       70.7  M57911
8    Phormia regina         Diptera      TCG         TAA         1539         68.3  L14946
9    Apis mellifera         Hymenoptera  ATA         T           1561         75.9  L06178

References: (1) This paper; (2) Flook et al., 1996; (3) Beard et al., 1993; (4) Mitchell et al., 1993; (5) Clary & Wolstenholme, 1985; (6 and 7) Satta & Takahata, 1990; (8) Sperling et al., EMBL database accession L14946 (unpublished); (9) Crozier & Crozier, 1993.
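The % AT column of Table 1 and the codon-position figures quoted in the text (57.9%, 60.6% and 90.8% for first, second and third positions) are simple base counts. The sketch below shows one way such values could be computed in Python; the function names and the toy sequence are hypothetical, and the code assumes a coding sequence that begins in frame at codon position 1. As a quick arithmetic check on the reported composition, 34.1% A plus 35.3% T gives the stated 69.4% A+T.

```python
from collections import Counter


def at_content(seq):
    """Percentage of A+T among unambiguous bases (A, C, G, T) in seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * (counts["A"] + counts["T"]) / total


def at_by_codon_position(coding_seq):
    """A+T percentage at first, second and third codon positions.
    Assumption: coding_seq starts at codon position 1 and has no gaps."""
    coding_seq = coding_seq.upper()
    return tuple(at_content(coding_seq[offset::3]) for offset in range(3))


# Toy coding sequence (hypothetical, not the C. parallelus gene):
# codons ATT GCA ACT TTA
toy_cds = "ATTGCAACTTTA"
print(at_content(toy_cds))            # overall A+T percentage
print(at_by_codon_position(toy_cds))  # (first, second, third)
```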

insect COII gene the COOH terminal also appears to be the most variable region. Pooling the data into classes in this way, however, will lose much information if there are large differences within these classes. Figures 4 and 5 show the mean variability per region (individual loops, transmembrane stretches or terminal regions), and large differences can be seen in the mean variability of different regions of the same structural class. Transmembrane helices M2, M6, M7 and M10 provide the metal ligands to interact with the two haem groups and copper atom which are essential for the activity of COI (Gennis, 1992). These regions can be seen to account for four of the seven highly conserved transmembrane helices. The fifth of these conserved transmembrane helices is M8, which is suggested to be involved with the cytochrome oxidase proton-conduction channel. This region contains three polar residues (Thr-352, Thr-359 and Lys-362) which are completely conserved among all organisms so far studied, and which are thought to be essential for this translocation activity (Gennis, 1992). Transmembrane helices M5 and M11 are also very conserved. External loops E3, E5, and especially E4, seem to be very conserved (Fig. 4). Although the functional role played by these interhelical loops is unclear, E5 is thought to lie very close to heme-A in the association to which Tyr-414 has been suggested to play an important role (Holm et al., 1987; Gennis, 1992).
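Per-region means of the kind discussed above can be sketched as follows. This section does not state how variability per site was scored, so the code assumes, purely for illustration, that it is the number of different amino acids observed in an alignment column (which has a minimum of 1 for an invariant site). The function names, region boundaries and toy protein alignment are all hypothetical.

```python
def site_variability(aligned_proteins):
    """Number of distinct residues in each alignment column, gaps ignored.
    Assumed measure of variability; the paper's own definition may differ."""
    length = len(aligned_proteins[0])
    if any(len(p) != length for p in aligned_proteins):
        raise ValueError("sequences must be aligned to the same length")
    return [
        len({p[col] for p in aligned_proteins} - {"-"})
        for col in range(length)
    ]


def mean_by_region(variability, regions):
    """Mean per-site variability for each named region.
    regions maps a name to a half-open, 0-based (start, end) column range."""
    return {
        name: sum(variability[start:end]) / (end - start)
        for name, (start, end) in regions.items()
    }


# Toy alignment and region map (hypothetical, not the Fig. 2 data).
toy = ["MSRQWLFSTN", "MSRQWLYSTN", "MARQWLFSSN"]
toy_regions = {"NH2": (0, 4), "M1": (4, 10)}

scores = site_variability(toy)
print(scores)
print(mean_by_region(scores, toy_regions))
```

Keeping the per-site scores separate from the regional summary makes it easy to compare regions of the same structural class, which is where pooled class means hide the differences.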

shows a two-dimensional structural model of the COI protein, with functionally essential (boxed) and variable (filled circles) residues in insects highlighted, respectively. It is clear that the distribution of variable residues is not random along the molecule (see below for further discussion).
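The translation step described in this section can be illustrated with a short Python sketch. The codon assignments below follow the invertebrate mitochondrial code as tabulated in NCBI translation table 5 (ATA = Met, TGA = Trp, AGA/AGG = Ser), which is an assumption here since the paper cites Clary & Wolstenholme (1985) instead. The function name, the `force_initial_met` flag and the toy sequence are hypothetical. Following the convention stated above, the first residue is written as methionine regardless of the initiation codon, and an incomplete terminal codon (such as a single T) is ignored.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...),
# assumed to follow NCBI translation table 5 (invertebrate mitochondrial).
AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSSSS"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_invert_mito(cds, force_initial_met=True):
    """Translate an in-frame coding sequence, stopping at the first stop codon.
    Unknown codons become 'X'; trailing bases short of a full codon are ignored."""
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    if force_initial_met and protein:
        protein[0] = "M"  # convention: first residue defined as methionine
    return "".join(protein)


# Toy sequence (hypothetical): TCG CAA CGA TGA plus an incomplete stop 'T'.
print(translate_invert_mito("TCGCAACGATGAT"))
```

Note that TGA is read as tryptophan rather than as a stop, which is why a translation made with the standard code would truncate prematurely.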

Mode and tempo of evolution of the insect COI gene

The COI amino acid sequence was divided into twenty-five regions comprising five structural classes [twelve transmembrane helices (M1-M12), six external loops (E1-E6), five internal loops (I1-I5), carboxy (COOH) and amino (NH2) terminals] as shown in Figs 2 and 3. To test the null hypothesis that there is no difference in average amino acid variability, per site, between the five structural classes, a Kruskal-Wallis analysis was employed, which led us to reject this hypothesis (P < 0.0001). When the average variability per residue site was calculated for the five structural classes, the COOH terminal was found to be significantly more variable than any other region (P < 0.05). The observed mean levels of variability were not significantly different between the amino terminal, internal loops, external loops or transmembrane regions (Table 2). This difference reflects the highly variable nature of COOH terminal amino acid sequences and agrees with Liu & Beckenbach (1992), who report that for an alignment of the

Table 2. Mean variability, per amino acid position, for the different structural classes of COI.

Region           Size (amino acids)   Mean variability   Standard deviation   SE mean
NH2 terminal             13                1.692               0.947            0.263
Internal loops          103                1.689               0.908            0.089
External loops          120                1.467               0.788            0.072
Transmembrane           232                1.461               0.755            0.050
COOH terminal            30                2.500               1.306            0.239
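The Kruskal-Wallis comparison of the five structural classes described in the text can be sketched in a few lines. This is a minimal standard-library illustration, not the StatView procedure the analysis actually used: the function names are hypothetical, the per-site scores in the example are invented placeholders rather than values from the alignment, and the P value relies on the usual chi-square approximation, implemented here only for an even number of degrees of freedom (five classes give four).

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf_even_df(x, df):
    """Upper tail of the chi-square distribution (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def kruskal_wallis(groups):
    """Return (H, df, P) for a list of samples, with tie correction."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    df = len(groups) - 1
    if correction == 0:  # every observation identical: no evidence of difference
        return 0.0, df, 1.0
    h /= correction
    return h, df, chi2_sf_even_df(h, df)


# Invented per-site variability scores for five classes (illustration only).
toy_classes = [
    [1, 2, 2, 1],      # NH2 terminal
    [1, 1, 2, 3, 1],   # internal loops
    [1, 1, 1, 2],      # external loops
    [1, 1, 1, 1, 2],   # transmembrane
    [3, 2, 4, 3],      # COOH terminal
]
h_stat, dof, p_value = kruskal_wallis(toy_classes)
print(f"H = {h_stat:.3f}, df = {dof}, P = {p_value:.4f}")
```

With real data each list would hold one variability score per alignment position in that class, so the class sizes would match the Size column of Table 2.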

© 1996 Blackwell Science Ltd, Insect Molecular Biology 5: 153-165

[Figure 2 alignment panel: COI amino acid sequences, positions 1-522, for Chorthippus, Locusta, A. gambiae, A. quadrimaculatus, D. yakuba, D. sechellia, D. simulans, Phormia and Apis, annotated with the structural regions NH2, M1-M12, E1-E6, I1-I5 and COOH and with the marked residues H102, H239, H333, H334, T352, T359, K362, Y414, H419 and H421.]

Figure 2. Alignment of COI amino acid sequences for nine species of insect. Identity to the C. parallelus sequence is denoted by a period, a deletion by a dash. Asterisks denote universally conserved residues, or those with functional significance discussed in the text. The twenty-five structural regions are indicated (see text and Fig. 3 for details). Note that the first amino acid residues for all COI proteins discussed here are defined as methionines regardless of the initiation codons, as suggested by Wolstenholme (1992).


Figure 4. A graph showing the mean amino acid variability for the twenty-five structural regions of the insect COI gene. The variability is expressed as the average number of amino acids per site observed in a given region.
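The variability measure used in Figs 4 and 5 (number of different amino acids per site, averaged over a region) can be sketched as follows. This is an illustrative outline only: the function names, the toy alignment and the region boundaries are hypothetical, and the input is assumed to contain explicit residues in every row rather than the period-for-identity notation of Fig. 2. A gap character is counted as an additional state, in line with the scoring of insertions and deletions described in the Experimental procedures.

```python
def site_variability(aligned_seqs):
    """Count the distinct states (residues or gaps) in each alignment column."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [len(set(column)) for column in zip(*aligned_seqs)]


def region_means(scores, regions):
    """Average the per-site scores over regions given as 1-based inclusive ranges."""
    return {
        name: sum(scores[start - 1:end]) / (end - start + 1)
        for name, (start, end) in regions.items()
    }


# Toy alignment and region boundaries, invented for illustration.
toy_alignment = ["MSRQWL", "MSRQWL", "M-RQWI", "MNKQWL"]
toy_regions = {"region_A": (1, 3), "region_B": (4, 6)}

scores = site_variability(toy_alignment)
print(scores)
print(region_means(scores, toy_regions))
```

A fully conserved region scores 1.0 under this measure, which is why the most conserved regions in Fig. 5 sit at or just above that value.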

Internal loop 1 is highly conserved, in contrast to the other four internal loops, which are relatively variable (Fig. 4). This pattern of conservation is also apparent in a similar alignment of twenty-two animal COI genes (unpublished data). A functional role for internal loop 1 is as yet unclear from current structural models of COI, though these data indicate that its function may be relatively important.

The foregoing discussion of protein variability raises the question: does amino acid variability reflect DNA variability? In this situation it probably does. The evidence for this assertion is twofold. Firstly, a certain correspondence between amino acid substitutions and nucleotide substitutions is expected despite the degeneracy of the genetic code, since it is the DNA sequence which determines the amino acid sequence. (However, this relationship is not a simple linear function: the degenerate nature of genetic codes makes it possible for DNA variation to either match or outweigh protein variation, and the relationship is also influenced by the phylogenetic distances between the organisms compared.) Secondly, the relationship between patterns of amino acid variability and nucleotide variability can clearly be seen by comparing Figs 1 and 2, with the COOH terminal coding region being the most variable part, followed by the regions coding for the E1, M3, E2, I2, I4, M9 and M12 structural regions. Finally, there is empirical evidence that the level of protein variability is a good predictor of DNA variability. The distribution of polymorphic sites in the COI gene between C. parallelus individuals in a population survey (Lunt, 1994) has been observed to match very closely the variability predictions made by this study. Furthermore, Howland & Hewitt (1995) have sequenced 400 bp of the COI gene in Coleopteran species with a variety of divergence levels. Their results show adjacent regions of DNA sequence conservation and variability, which correspond to the loop and transmembrane structures in exactly the same pattern as reported here. These studies show that the expectation of shared patterns of DNA and protein variability is well founded, and that the regions identified in this study will almost certainly express the expected level of variability in most organisms. Given this pattern of conservation of sequence and structure, one is able to choose a level of variability to suit one's particular phylogenetic investigation.

Conserved primers and their potential for use in evolutionary studies

Primers which are conserved across many insect groups would be extremely useful in many types of evolutionary studies. The variability observed between insect COI sequences has been quantified by position and conserved primers designed to exploit

Table 3. Details of the ten conserved primers designed to cover the whole insect COI gene. Positions given are those in Fig. 1. Primers UEA1 and UEA10 are found in the tRNA-Tyr and tRNA-Leu genes respectively. Their positions relative to the COI gene (in the D. yakuba sequence; Clary & Wolstenholme, 1985) are 56 bp (Tyr) and +8 bp (Leu).

Primer   Strand      Position (3' base)   Length (bp)   Sequence
UEA1     Sense       tRNA-Tyr             26            GAATAATTCCCATAAATAGATTTACA
UEA2     Antisense   375                  26            TCAAGATAAAGGAGGATAAACAGTTC
UEA3     Sense       294                  24            TATAGCATTCCCACGAATAAATAA
UEA4     Antisense   618                  24            AATTTCGGTCAGTTAATAATATAG
UEA5     Sense       621                  24            AGTTTTAGCAGGAGCAATTACTAT
UEA6     Antisense   966                  29            TTAATWCCWGTWGGNACNGCAATRATTAT
UEA7     Sense       900                  24            TACAGTTGGAATAGACGTTGATAC
UEA8     Antisense   1266                 24            AAAAATGTTGAGGGAAAAATGTTA
UEA9     Sense       1284                 26            GTAAACCTAACATTTTTTCCTCAACA
UEA10    Antisense   tRNA-Leu             25            TCCAATGCACTAATCTGCCATATTA
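For readers working with the primers in Table 3, the short sketch below shows three routine calculations: the reverse complement of a primer, the number of distinct oligonucleotides represented by a degenerate primer such as UEA6, and a rough expected product size for a primer pair. The helper names are hypothetical and the code is not part of the original study. The size calculation assumes that the tabulated positions are 1-based coordinates of each primer's 3' base and that no alignment gaps fall between the two primers, so it should be read as an estimate that may differ by a base or so from sizes quoted in the text.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYWSKMBDHVN", "TGCAYRWSMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement, keeping IUPAC ambiguity codes valid."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def degeneracy(seq):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in seq.upper():
        total *= len(IUPAC[base])
    return total


def amplicon_length(sense_3p, sense_len, antisense_3p, antisense_len):
    """Estimate product size from 3'-base positions (assumed 1-based, gap-free)."""
    sense_5p = sense_3p - sense_len + 1              # leftmost base of the product
    antisense_5p = antisense_3p + antisense_len - 1  # rightmost base of the product
    return antisense_5p - sense_5p + 1


uea6 = "TTAATWCCWGTWGGNACNGCAATRATTAT"
uea8 = "AAAAATGTTGAGGGAAAAATGTTA"
print(degeneracy(uea6))
print(reverse_complement(uea8))
print(amplicon_length(294, 24, 1266, 24))  # UEA3 with UEA8, roughly 1 kb
```

Taking the reverse complement of an antisense primer gives the sense-strand sequence it anneals opposite, which makes it easy to compare against the aligned sequences in Fig. 1.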


Figure 5. An illustration of the distribution of amino acid variability (number of different amino acids per residue site) along the insect COI gene, plotted against amino acid position. The twenty-five structural regions, and their mean levels of variability (ranging from 1.0 to 2.5), are shown above the graph. The positions of the ten conserved primers are given by arrows; primers UEA1 and UEA10 are in the tRNA-Tyr and tRNA-Leu genes respectively.

areas of differing variability. From the multiple alignment of the DNA sequences of nine insect COI genes presented in Fig. 1, it can be seen that several areas are broadly conserved, so the same, or slightly modified, primers may be applicable to many organisms. Primers were thus designed to cover the whole of the insect COI gene and positioned to aid the sequencing of regions of different variability. The primers are described in Table 3 and their positions shown in Fig. 5. Primer UEA10, located in the tRNA-Leu gene, although identified independently in this study, turns out to be the general insect primer 'PAT' designed in the laboratory of R. Harrison (Cornell University, pers. comm.).

To test the universality of the conserved primers identified in this study, PCR amplifications were carried out for nine insect taxa. These taxa cover the main divisions of the class Insecta, from wingless insects of the order Thysanura (silverfish, Lepisma saccharina, and firebrat, Thermobia domestica), to winged insects of the orders Odonata (damselfly, Calopteryx splendens), Orthoptera (desert locust, Schistocerca gregaria, and meadow grasshopper, Chorthippus parallelus), Hemiptera (pea aphid, Acyrthosiphon pisum), Coleoptera (beetle, Carabus vidaceous), Diptera (fruit fly, Drosophila melanogaster) and Hymenoptera (bumble bee, Bombus lapidarius). Figure 6 shows the amplification product obtained using the primer pair UEA3-UEA8. These primers cover a large part (1018 bp) of the COI gene, from the I1 region to the M11 region (see Figs 3 and 5). A single main band of the expected size was produced in all insect taxa described above with the exception of the pea aphid, A. pisum (no product under the assay conditions, although other primers amplify the COI gene in this aphid). Direct terminal sequence analysis of the amplified PCR fragments confirms that they are COI gene fragments and that all sequences are taxon-specific, thus excluding the possibility of cross-contamination (terminal sequences of the PCR fragments for these species are available from the authors). A preliminary phylogenetic analysis of these sequences further confirmed the authenticity of the amplified bands.

Figure 6. PCR products amplified from nine different arthropod taxa using primer pairs UEA3-UEA8 or UEA7-UEA10. A specific band of ~1 kb was amplified from all taxa except the pea aphid, A. pisum, using the primer pair UEA3-UEA8. The ~0.7 kb PCR product for A. pisum shown in the photograph was obtained using primer pair UEA7-UEA10. The marker is a 123 bp size ladder (GIBCO BRL).

PCR amplifications with other primer combinations indicate that most of the primers described here work well in many of these different taxa. Degenerate forms of some of these primers have also been designed based on the sequence alignment in Fig. 1. We are currently investigating further the universality of all these primers, and the results of this analysis will be reported later. Initial results indicate that the primers identified in this study are quite broadly conserved, with primers UEA7, UEA9 and UEA10 working well even between the superclasses Insecta and Arachnida, which are thought to have diverged during the Devonian period at least 400 million years ago (Pearse et al., 1987).

As a general guide to the applicability of these primers, regions amplified by primer combinations UEA5-UEA6, UEA5-UEA8 or UEA7-UEA8 could be suitable for higher-level evolutionary studies (e.g. at genus or family level); regions amplified with primer pairs UEA3-UEA4 or UEA7-UEA10 should be more variable and thus suitable for lower-level analyses (such as the study of intraspecific variation and the phylogenetics of closely related species). The use of the last two primer pairs (UEA3-UEA4 and UEA7-UEA10) in population analysis of beetles, grasshoppers and other insects in our laboratory suggests that these regions are probably variable enough to reveal intraspecific polymorphisms (unpublished data, G.M.H. et al.).

In this report we show that a detailed examination of the evolutionary patterns of a DNA region can provide valuable guidelines for its effective use as a molecular marker in phylogenetic studies. While this paper was in review, Simon et al. (1994) published a most useful compendium of conserved primers covering the whole insect mitochondrial genome. The increasing use of such primers will certainly reveal more information on the evolutionary patterns of other mitochondrial genes, which will in turn help us to assess their usefulness for addressing phylogenetic questions at different taxonomic levels.


Experimental procedures

Insect COI gene sequences

The sequence of the C. parallelus COI gene was obtained as part of the complete sequence of a 6.4 kb mtDNA HindIII fragment as described in Zhang et al. (1995). Both strands of the COI gene have been sequenced. Other insect DNA sequences were taken from the GenBank and EMBL databases, with the exception of the Locusta migratoria sequence, which was kindly supplied by P. Flook prior to publication. The accession numbers of these sequences are L20934 (Anopheles gambiae), L04272 (Anopheles quadrimaculatus), L14946 (Phormia regina), L06178 (Apis mellifera), X03240 (Drosophila yakuba), M57908 (Drosophila sechellia), M57911 (Drosophila simulans) and X80245 (Locusta migratoria) (see Table 1 for individual references). All sequences were aligned by the Clustal V method (Higgins & Sharp, 1989) and translated with the invertebrate mitochondrial genetic code using programs implemented in the LASERGENE computer software package (DNASTAR). The aligned DNA sequences were then examined manually using amino acid sequences and codon positions as references, producing the alignment shown in Fig. 1, which appears to be quite robust.
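Translation with the invertebrate mitochondrial genetic code differs from the standard code at only four codons (AGA and AGG encode serine, ATA encodes methionine, and TGA encodes tryptophan). The sketch below illustrates this with a small stand-alone translator. It is not the LASERGENE routine used for the alignment; the function name and the `force_initial_met` option are hypothetical, the latter mirroring the convention noted in the Fig. 2 legend that the first residue is written as methionine regardless of the initiation codon.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
INVERT_MITO = dict(zip(CODONS, STANDARD))
# Reassignments that define the invertebrate mitochondrial code.
INVERT_MITO.update({"AGA": "S", "AGG": "S", "ATA": "M", "TGA": "W"})


def translate_invert_mito(dna, force_initial_met=False):
    """Translate an in-frame DNA string; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    protein = [
        INVERT_MITO.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - len(dna) % 3, 3)
    ]
    if force_initial_met and protein:
        protein[0] = "M"  # convention: report the initiator as methionine
    return "".join(protein)


print(translate_invert_mito("ATAAGATGATAA"))
```

Under the standard code the same string would read as isoleucine, arginine and two stop codons, which is why the choice of translation table matters for mitochondrial genes.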

Analysis of variability and statistical tests

The aligned amino acid sequence was divided into twenty-five regions comprising five structural classes (twelve transmembrane helices, six external loops, five internal loops, carboxy and amino terminals: Fig. 3). The points of transition between these regions were taken from Gennis (1992). The number of different amino acids observed at each position of the protein alignment was recorded and the variability level expressed as the average number of amino acids per site observed in a given region. A spreadsheet written (by J. B. Lunt and D. H. Lunt) to score the variability across such alignments is available from the authors. This analysis was limited to the 498 homologous positions between the ends of the shortest sequence. Insertion and deletion events were scored equally to the possession of a novel amino acid. Statistical tests were carried out using the StatView SE v1.03 (Abacus Concepts Inc.) software package. A Kruskal-Wallis test (analysis of variance by ranks) was performed on the data sets to test the null hypothesis (H0): there is no difference in average amino acid variability, per site, between the five structural classes. In the event of H0 being rejected, a further analysis would be performed to determine between which of the samples significant differences occur. This analysis followed the method described by Zar (1984).

Conserved primers, PCR amplification and direct sequencing

Primers for amplifying and sequencing the whole of the COI gene were designed using the Oligo 4.0 (National Biosciences Inc.) software package following the guidelines given by Rychlik (1992). Nine insect taxa, which cover the main divisions of the superclass Insecta, were used to test the primers identified in this study, viz: Lepisma saccharina (silverfish, Apterygota, order Thysanura), Thermobia domestica (firebrat, Apterygota, order Thysanura), Calopteryx splendens (damselfly, Pterygota, order Odonata), Schistocerca gregaria (locust, Pterygota, order Orthoptera), Chorthippus parallelus (grasshopper, Pterygota, order Orthoptera), Acyrthosiphon pisum (aphid, Pterygota, order Homoptera), Drosophila melanogaster (fruit fly, Pterygota, order Diptera), Carabus vidaceous (beetle, Pterygota, order Coleoptera) and Bombus lapidarius (bee, Pterygota, order Hymenoptera). An arachnid (spider, Tegenaria domesticus) was also included to gauge the broader generality of these primers. DNA was purified from individual insects using a phenol/chloroform based extraction as described by Zhang et al. (1995). PCR was carried out in a 50 µl reaction containing 1.5-2.0 mM MgCl2, 200 µM dNTP, 0.15 µM of each primer, and 2 units of Taq polymerase (Promega) in 1 x reaction buffer (50 mM KCl, 10 mM Tris-HCl, 0.1% Triton X-100, pH 9.0 at 25°C, Promega). Following an initial denaturation at 94°C for 5 min, thirty to forty cycles were performed in a DNA Thermal Cycler 480 (Perkin Elmer Cetus), each consisting of melting at 95°C for 40 s, annealing at 4845°C for 1 min, and extension at 72°C for 40 s to 1 min 40 s. 5 µl of PCR product were used for direct DNA sequencing using the Sequenase PCR Products Sequencing Kit (USB-Amersham) following the manufacturer's protocol.
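As a worked illustration of the reaction set-up above, the volume of each stock solution needed to reach the stated final concentrations in a 50 µl reaction follows from the dilution relation C1V1 = C2V2. The stock concentrations used below (25 mM MgCl2, 10 mM dNTP mix, 10 µM primer) are assumptions chosen for the example and are not given in the text; the function name is likewise hypothetical.

```python
REACTION_UL = 50.0  # reaction volume stated in the text


def stock_volume_ul(final_conc, stock_conc, reaction_ul=REACTION_UL):
    """Volume of stock to add so that final_conc is reached (same units for both)."""
    if stock_conc <= 0 or final_conc < 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as final")
    return final_conc * reaction_ul / stock_conc


# (final concentration, ASSUMED stock concentration), both in the unit shown.
components = {
    "MgCl2 (mM)": (1.5, 25.0),       # lower end of the 1.5-2.0 mM range
    "dNTP (uM)": (200.0, 10000.0),   # assumed 10 mM stock
    "each primer (uM)": (0.15, 10.0),
}

volumes = {name: stock_volume_ul(final, stock) for name, (final, stock) in components.items()}
for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} ul")
```

Because two primers are added per reaction, the primer volume applies to each of them separately when the remaining volume is made up with buffer, template and water.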

Acknowledgements

We acknowledge the help of Pamy Noldner, Marcus Rowcliffe, Nick Watmough, John Noble-Nesbit, Graham Hopkins and John Lunt. We also thank three anonymous reviewers for their valuable comments. We are grateful to Paul Flook for providing us with the Locusta COI sequence prior to publication. This work was supported by grants from the S.E.R.C. and E.U.

References

Avise, J.C., Arnold, J., Ball, R.M., Bermingham, E., Lamb, T., Neigel, J.E., Reeb, C.A. and Saunders, N.C. (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annu Rev Ecol Syst 18: 489-522.
Ballard, J.W., Olsen, G.J., Faith, D.P., Odgers, W.A., Rowell, D.M. and Atkinson, P.W. (1992) Evidence from 12S ribosomal RNA sequences that Onychophorans are modified arthropods. Science 258: 1345-1348.
Beard, C.B., Hamm, D.M. and Collins, F.H. (1993) The mitochondrial genome of the mosquito Anopheles gambiae: DNA sequence, genome organization, and comparisons with mitochondrial sequences of other insects. Insect Mol Biol 2: 103-124.
Clary, D.O. and Wolstenholme, D.R. (1985) The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J Mol Evol 22: 252-271.
Crozier, R.H. and Crozier, Y.C. (1993) The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics 133: 97-117.
Flook, P.K., Rowell, C.H.F. and Gellissen, G. (1996) The sequence, organisation and evolution of the Locusta migratoria mitochondrial genome. J Mol Evol, in press.
Gennis, R.B. (1992) Site-directed mutagenesis studies on subunit I of the aa3-type cytochrome c oxidase of Rhodobacter sphaeroides: a brief review of progress to date. Biochim Biophys Acta 1101: 184-187.
Harrison, R.G. (1989) Animal mtDNA as a genetic marker in population and evolutionary biology. Trends Ecol Evol 4: 6-11.
Hauke, H.A. and Gellisson, G. (1988) Different mitochondrial gene orders among insects: exchanged tRNA gene positions in the COII/COIII region between an orthopteran and dipteran species. Curr Genetics 14: 471-476.
Higgins, D.G. and Sharp, P.M. (1989) Fast and sensitive multiple sequence alignments on a microcomputer. Computer Applic Biosci 5: 151-153.
Holm, L., Saraste, M. and Wilkstrom, M. (1987) Structural models of the redox centers in cytochrome-oxidase. EMBO J 6: 2819-2823.
Howland, D.E. and Hewitt, G.M. (1995) Phylogeny of the Coleoptera based on mitochondrial cytochrome oxidase I sequence data. Insect Mol Biol 4: 203-215.
Liu, H. and Beckenbach, A.T. (1992) Evolution of the mitochondrial cytochrome oxidase II gene among ten orders of insects. Mol Phylogen Evol 1: 41-52.
Lunt, D.H. (1994) MtDNA differentiation across Europe in the meadow grasshopper Chorthippus parallelus (Orthoptera: Acrididae). Ph.D. thesis, University of East Anglia, Norwich.
Moritz, C., Dowling, T.E. and Brown, W.M. (1987) Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annu Rev Ecol Syst 18: 269-292.
Ojala, D., Montoya, J. and Attardi, G. (1981) tRNA punctuation model of RNA processing in human mitochondria. Nature 290: 470-474.
Paabo, S., Thomas, W.K., Whitfield, K.M., Kumazawa, Y. and Wilson, A.C. (1991) Rearrangements of mitochondrial transfer RNA genes in marsupials. J Mol Evol 33: 426-430.
Pearse, V., Pearse, J., Buchsbaum, M. and Buchsbaum, R. (1987) Living Invertebrates. The Boxwood Press, Pacific Grove, California.
Rychlik, W. (1992) Oligo Version 4.0: Reference Manual. National Biosciences, Inc., Plymouth, Minnesota.
Saraste, M. (1990) Structural features of cytochrome oxidase. Q Rev Biophys 23: 331-366.
Satta, Y., Ishiwa, H. and Chigusa, S.I. (1987) Analysis of nucleotide substitutions of mitochondrial DNAs in Drosophila melanogaster and its sibling species. J Mol Biol 4: 638-650.
Simon, C. (1991) Molecular systematics at the species boundary: exploiting conserved and variable regions of the mitochondrial genome of animals via direct sequencing from amplified DNA. Molecular Techniques in Taxonomy (Hewitt, G.M., Johnston, A.W.B., Young, J.P.W., eds), pp. 33-71. Springer, Berlin.
Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H. and Flook, P. (1994) Evolution, weighting, and phylogenetic utility of mitochondrial gene-sequences and a compilation of conserved polymerase chain-reaction primers. Ann Ent Soc Amer 87: 651-701.
Smith, M.J., Arndt, A., Gorski, S. and Fajber, E. (1993) The phylogeny of echinoderm classes based on mitochondrial gene arrangements. J Mol Evol 36: 545-554.
Wolstenholme, D.R. (1992a) Animal mitochondrial DNA: structure and evolution. Int Rev Cytol 141: 173-216.
Wolstenholme, D.R. (1992b) Genetic novelties in mitochondrial genomes of multicellular animals. Curr Op Genet Devel 2: 918-925.
Zar, J.H. (1984) Biostatistical Analysis. Prentice-Hall, London.
Zhang, D.-X., Szymura, J.M. and Hewitt, G.M. (1995) Evolution and structural conservation of the control region of insect mitochondrial DNA. J Mol Evol 40: 382-391.
