Mitochondrial Data in Monocot Phylogenetics

Mitochondrial sequences are an important source of data in animal phylogenetics, equivalent in importance to plastid sequences in plants. However, in recent years plant systematists have begun exploring the mitochondrial genome as a source of phylogenetically useful characters. The plant mitochondrial genome is renowned for its variability in size, structure, and gene organization, but this need not be of concern for the application of sequence data in phylogenetics. However, the incorporation of reverse transcribed mitochondrial genes ("processed paralogs") and the recurring transfer of genes from the mitochondrion to the nucleus are evolutionary events that must be taken into account. RNA editing of mitochondrial genes is sometimes considered a problem in phylogenetic reconstruction, but we regard it only as a mechanism that may increase variability at edited sites and change the codon position bias accordingly. Additionally, edited sites may prove a valuable tool in identifying processed paralogs. An overview of genes and sequences used in phylogenetic studies of angiosperms is presented. In the monocots, a large amount of mitochondrial sequence data is being collected together with sequence data from plastid and nuclear genes, thus offering an opportunity to compare data from different genomic compartments. The mitochondrial and plastid data are incongruent when organelle gene trees are reconstructed. Possible reasons for the observed incongruence involve sampling of paralogous sequences and highly divergent substitution rates, potentially leading to long-branch attraction. The above problems are addressed in Acorales, Alismatales, Poales, Liliaceae, the "Anthericum clade" (in Agavaceae), and in some achlorophyllous taxa.


INTRODUCTION
Mitochondrial DNA sequence data have long been an important source of data in animal phylogenetics (e.g., Curole and Kocher 1999), equivalent in importance to the use of plastid DNA data in plant studies (e.g., Soltis and Soltis 1998). The reluctance among botanists to use mitochondrial data has been induced primarily by the pronounced structural diversity of plant mitochondrial genomes, caused at least in part by their ability to recombine (Palmer 1992; Backert et al. 1997). Structural instability coupled with a low level of sequence diversity rendered the use of restriction endonuclease site variability impractical in the early days of molecular systematics (Palmer 1992). By the time systematists turned to DNA sequence data, the focus on plastid and nuclear genomes already had been established. However, the structural diversity of the mitochondrial genome is generally not a problem for the use of mitochondrial sequence data in phylogenetic reconstruction, and in the past ten years plant systematists have begun exploring the mitochondrial genome as a new source of useful characters (Hiesel et al. 1994; Soltis and Soltis 1998).
There are several advantages to using mitochondrial data in phylogenetic reconstruction. Most importantly, mitochondria belong to a separate linkage group from plastids, and hence provide independent phylogenetic evidence. Further, mitochondrial sequences may be the only data that are able to place achlorophyllous taxa in phylogenies based on data derived from the organelle genomes. A brief review of mitochondrial sequences used in angiosperm phylogenetics is given below, together with a short account of the genomic composition of plant mitochondria. Recent studies have demonstrated that features of the plant mitochondrial genome other than structural diversity may potentially create problems for phylogenetic reconstructions. These potential problems, which include RNA editing, gene duplication (paralogy), and gene transfer (e.g., Bowe and dePamphilis 1996; Palmer et al. 2000; Bergthorsson et al. 2003), will be discussed below.
An increasing amount of mitochondrial sequence data is being collected for the monocots. Part of the data presented here (for the gene atp1) is included in the papers by Chase et al. (2006) on monocot phylogeny, by Pires et al. (2006) on the phylogeny of Asparagales, and by Fay et al. (2006) on the phylogeny of Liliales. These data, together with unpublished data from another mitochondrial gene, cob (coding for apocytochrome b), will be used to explore incongruence between data from the three genomic compartments and evaluate the potential problems caused by gene duplication and gene transfer. Details on cob data sampling and data analysis will be published elsewhere. The circumscription of higher taxa follows the Angiosperm Phylogeny Group II system (APG II 2003).

GENOME COMPOSITION AND EVOLUTION
Six complete mitochondrial plant genomes have been sequenced to date. These are from Marchantia (Oda et al. 1992), Arabidopsis (Unseld et al. 1997), Beta (Kubo et al. 2000), Oryza (Notsu et al. 2002), Brassica (Handa 2003), and Zea (GenBank accession number AY506529; S. Clifton, C. Fauron, M. Gibson, P. Minx, K. Newton, M. Rugen, J. Spieth, and H. Sun unpubl. data). Another rather well-characterized monocot mitochondrial genome is that of Triticum (Bonen 1995). The plant mitochondrial genome varies more than tenfold in size: from some 200 to 2400 kb (Gray et al. 1998). In contrast, the gene content of higher plants is relatively conserved, though remarkable cases of gene loss exist (Adams et al. 2002). The gene complement includes a maximum of 40 protein-coding genes: 14 ribosomal protein genes, 23 genes involved in the respiratory chain, and 3 other protein-coding genes. Additional open reading frames for putative protein products may occur (e.g., Notsu et al. 2002). The mitochondrial genome also includes a number of non-protein-coding genes, including tRNA and rRNA genes.
The mitochondrial genomes of higher plants include long sections of sequences derived from the plastid and nuclear genomes. In the Oryza mitochondrion, imported sequences constitute almost 20% of the entire genome (Notsu et al. 2002). This fraction is lower (approximately 5%) in two of the completely sequenced dicot mitochondrial genomes (Marienfeld et al. 1999; Kubo et al. 2000). Sequences of nuclear origin are mainly transposable elements, in particular retrotransposons (Marienfeld et al. 1999; Kubo et al. 2000; Notsu et al. 2002). Sequences of plastid origin include tRNA genes, protein-coding genes, and noncoding regions. Whereas the tRNA genes are functional in the mitochondrial genomes, the protein-coding genes seem to lose their function and become pseudogenes (e.g., Notsu et al. 2002; Cummings et al. 2003). However, sequence migration appears to be an ongoing process with newly transferred sequences being identical to their plastid counterparts (Marienfeld et al. 1999; Notsu et al. 2002).
A number of mitochondrial genes have been used in angiosperm phylogenetics (see Table 1). A few additional genes (nad2 and nad5) have been applied within bryophytes, pteridophytes, and gymnosperms (Beckert et al. 1999, 2001; Vangerow et al. 1999; Wang et al. 2000; Pruchner et al. 2002; Shaw et al. 2003; Cox et al. 2004; Hyvonen et al. 2004). Cytochrome oxidase 3 (cox3) has also been used to reconstruct tracheophyte or embryophyte phylogeny (Hiesel et al. 1994; Bowe and dePamphilis 1996). Mitochondrial genes have been used mainly at high taxonomic levels (see Table 1) due to the conserved nature of the coding sequences (Wolfe et al. 1987; Laroche et al. 1997; but see, e.g., Szalanski et al. 2001; Sanjur et al. 2002; Chat et al. 2004 for generic level studies). Evolutionary rates differ among mitochondrial genes, with the nonsynonymous substitution rates varying up to 30-fold (synonymous substitution rates vary only up to 4-fold). Only the more rapidly evolving mitochondrial genes (e.g., atp9) reach substitution rates comparable to those of slowly evolving plastid genes (Laroche et al. 1997). Mitochondrial introns and other noncoding regions evolve at faster rates and have been used in phylogenetic reconstruction at lower taxonomic levels (see Table 1). The substitution rate in introns is comparable to the synonymous substitution rates in exons (Laroche et al. 1997; Laroche and Bousquet 1999), but, as for noncoding sequences of the plastid and nuclear genomes, length mutations are abundant. Hence, the rate of character variation and potential alignment problems should be considered before choosing a noncoding region for phylogenetic reconstruction. Intergenic regions involved in recombinational activity should probably be avoided. Lists of more or less "universal" primers designed for amplification of coding as well as noncoding mitochondrial regions were provided by Demesure et al. (1995), Dumolin-Lapegue et al. (1997), and Duminil et al. (2002). Given the limited number of mitochondrial sequences used in plant phylogenetics and other evolutionary studies, our knowledge of substitution rates and patterns is still fragmentary. However, it has become clear that the substitution pattern is affected by RNA editing (see below), which may vary drastically among gene sequences (e.g., Notsu et al. 2002). Dramatic differences in substitution rates among taxa have also been reported (Adams et al. 1998; Palmer et al. 2000), but whether these are artifacts caused by paralogous sequences (see below) is unclear.

RNA Editing
RNA editing is a typical feature of the mitochondrion of embryophytes except in some groups of thalloid liverworts (Steinhauser et al. 1999). RNA editing is a post-transcriptional process involving pyrimidine exchanges in mRNAs or tRNAs. In the angiosperms changes from C to U dominate, whereas changes from U to C are rare. Editing affects most if not all mitochondrial protein-coding genes, but to a different extent. Levels of editing ranging from 0 to 19% of the codons have been reported (Giege and Brennicke 1999). A clear positional bias can be observed, with more than half of all edited sites being 2nd positions; the remaining sites are either all 1st positions or predominantly so, with a few 3rd positions (Pesole et al. 1996; Giege and Brennicke 1999; Szmidt et al. 2001). This positional bias affects the bias in substitution rates because sites prone to editing become free to vary more. Whereas unedited nuclear or plastid genes often have a strong bias towards 3rd position changes, and 2nd positions usually are the most conserved, mitochondrial genes tend to have less substitution bias (Pesole et al. 1996). In the rps12 gene, with approximately 9% edited codons, a positional bias close to 2:3:4 was observed (Pesole et al. 1996). Noncoding sites, e.g., within introns, may also be affected by RNA editing (Bonen et al. 1998; Giege and Brennicke 1999).
Table 1 (fragment). Sources cited: Qiu et al. 1999, 2000, 2001; Barkman et al. 2000; Duvall 2000; Nickrent et al. 2002; Bergthorsson et al. 2003; Zanis et al. 2003; Davis et al. 1998, 2004; Stevenson et al. 2000. a Higher-level phylogenies of, e.g., embryophytes (Qiu and Palmer 2004) are not included, irrespective of the number of included angiosperms. b Only papers published prior to 14 Sep 2004 or published in this volume. c Mainly intron sequence. d Gene and intergene regions.
In a phylogenetic context, it has been suggested that cDNA sequences should be used rather than genomic sequences, as the latter do not predict the protein sequences (Hiesel et al. 1994). Alternatively, the edited sites have been excluded from phylogenetic analysis of genomic DNA sequences (e.g., Bergthorsson et al. 2003). However, transcriptional editing is usually not considered a hindrance to phylogenetic analysis of genomic DNA: introns are frequently used, though they are excised during the transcriptional process. Empirical studies have demonstrated limited topological differences between trees derived from analyses of either genomic DNA or cDNA, or from analyses including or excluding edited sites (Bowe and dePamphilis 1996; Vangerow et al. 1999; Szmidt et al. 2001; Davis et al. 2004). The main effects of excluding edited sites or using cDNA sequences seem to be a decrease in the number of informative sites and a minor decrease in the number of resolved clades. The increased variability at edited sites may potentially increase homoplasy at these sites, as has been observed with pteridophyte nad5 gene sequences (Vangerow et al. 1999) and monocot atp1 sequences (Davis et al. 2004). However, the potential increase in homoplasy at edited sites should not necessarily be of concern (as long as genomic sequences are consistently used rather than being intermixed with cDNA sequences), just as increased homoplasy of 3rd positions in unedited genes (e.g., rbcL) is not necessarily a problem. Kallersjo et al. (1998) demonstrated that the homoplasious 3rd positions even added substantially to the structure of the tree.
[...] paralogous gene copy. Such inserted sequences, which have also experienced RNA editing, have been termed "processed paralogs" (Bowe and dePamphilis 1995, 1996). Mixing paralogous sequences in phylogenetic analysis has always presented a potential source for error in reconstructing taxon phylogenies.
A sequence created through simple duplication will initially be identical to the original sequence. Hence, whether one or the other is included in phylogenetic reconstruction is irrelevant; if both are included, they will constitute a clade (Fig. 1). Only following sequence divergence may inadequate sampling of paralogs cause problems. However, a processed paralog that has experienced RNA editing is, from the time of insertion in the genome, different from the original sequence. Hence, the original gene and the processed paralog may no longer constitute a clade in a phylogenetic tree (Fig. 1). Inclusion of just the processed paralog may give misleading results.
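The contrast between the two kinds of copy can be illustrated with a minimal sketch. The toy sequence and the editing positions below are invented for illustration, and T stands in for U as it would in a reverse-transcribed DNA copy; nothing here is taken from the data sets discussed in the text.

```python
def apply_rna_editing(genomic, edited_sites):
    """Return the transcript-like sequence with C replaced by T at edited sites.

    `edited_sites` holds 0-based positions and is a hypothetical input.
    """
    bases = list(genomic)
    for site in edited_sites:
        if bases[site] != "C":
            raise ValueError(f"site {site} holds {bases[site]!r}, not 'C'")
        bases[site] = "T"
    return "".join(bases)


def count_differences(seq_a, seq_b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Toy progenitor gene with two invented editing sites (both hold C).
progenitor = "ATGCCATCAGGC"
edited_sites = {4, 7}

direct_copy = progenitor  # simple duplication: identical at the time of insertion
processed_paralog = apply_rna_editing(progenitor, edited_sites)  # copy of edited mRNA

print(count_differences(progenitor, direct_copy))        # 0
print(count_differences(progenitor, processed_paralog))  # 2
```

The direct copy starts at zero distance from its progenitor, whereas the processed paralog differs at every edited site from the moment it is inserted.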
Processed paralogs, once generated, may be inserted into the mitochondrial or nuclear genome. Transfer of processed paralogs between genomic compartments appears to be a recurring phenomenon in plant phylogeny. A survey of 280 angiosperms has recently demonstrated that many genes are lost from the mitochondrial genome and may instead be found in the nucleus (Adams et al. 2000, 2001, 2002; Palmer et al. 2000; Adams and Palmer 2003). Each of the 14 ribosomal-protein genes has been lost from the mitochondrial genome one or more times (up to 42 times in all) during angiosperm evolution (Adams et al. 2002). This is also true for two of the remaining protein-coding genes (sdh3 and sdh4), whereas the rest are only rarely or never lost (Adams et al. 2002). Among those genes reported to be lost only once are nad3 in Piperaceae (Palmer et al. 2000) and cox2 in Fabaceae (Adams et al. 1999; Palmer et al. 2000), the latter being the best-documented case to date. Other genes that may be transferred to the nucleus include cob, atp1, and nad1 (see below). Just as some genes are more readily lost than others, there are differences in the amount of gene loss among taxa. Certain taxa among the monocots, including Allium, Lachnocaulon, and members of Alismatales (excluding Araceae and Tofieldiaceae), henceforth referred to as the alismatids, experienced a massive loss of genes, whereas others, such as most members of Zingiberales, have retained a complete set of mitochondrial genes (Adams et al. 2002).
The case of the cox2 gene in Fabaceae demonstrates the complexity of loss and transfer of genes (Adams et al. 1999; Palmer et al. 2000). Some species have two functional gene copies, one in the mitochondrial genome and one in the nucleus. Others have only one functional copy, either in the nucleus or in the mitochondrion. However, the nonfunctional copy may still be present (but possibly truncated) or it may be entirely missing. In addition to the cox2 case, in which the nuclear copy is a processed paralog, it has also been demonstrated that even regions comprising hundreds of kilobases of mitochondrial genomic DNA can be directly transferred to the nucleus (Stupar et al. 2001).
In a phylogenetic context, this implies that great caution should be taken both in data collection and in the interpretation of results (see also Bowe and dePamphilis 1996). After duplication the two copies will have different histories, and one or the other copy may degenerate or disappear in some lineages. In the worst-case scenario, the original sequence may disappear while the processed paralog remains. Two aspects of sequence evolution may help to identify sequences as processed paralogs: they may show signs of editing (Ts instead of Cs at edited sites and lack of introns), and if transferred to the nucleus they may show accelerated substitution rates and different codon position biases compared to the sequences located in the mitochondrion.
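The first of these two signatures can be sketched as a simple screen. This is a hedged illustration only: it assumes that the positions of edited sites are already known from a closely related reference, which is often not the case, and the sequences, site positions, and function name are all hypothetical.

```python
def fraction_t_at_edited_sites(seq, edited_sites):
    """Share of known editing sites at which a genomic sequence carries T.

    A value near 1.0 is the pattern expected of a processed paralog;
    a value near 0.0 is the pattern expected of an unedited genomic copy.
    `edited_sites` (0-based positions) is an assumed, externally supplied input.
    """
    sites = sorted(edited_sites)
    if not sites:
        raise ValueError("at least one edited site is required")
    return sum(seq[i].upper() == "T" for i in sites) / len(sites)


# Invented example: three editing sites assumed from a reference taxon.
reference_edited_sites = {4, 7, 11}

mitochondrial_like = "ATGCCATCAGGC"  # C retained at all three sites
candidate_paralog = "ATGCTATTAGGT"   # T at all three sites
ambiguous_copy = "ATGCTATCAGGC"      # T at one site only

print(fraction_t_at_edited_sites(mitochondrial_like, reference_edited_sites))  # 0.0
print(fraction_t_at_edited_sites(candidate_paralog, reference_edited_sites))   # 1.0
```

Intermediate values, as in `ambiguous_copy`, cannot be interpreted without further evidence, because a T at an editing site may also arise by ordinary substitution.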
However, it has been suggested, but not yet conclusively proven, that the entire mitochondrial genome in some taxa (e.g., Geraniaceae and Plantago) has a drastically accelerated rate of substitution (Adams et al. 1998; Palmer et al. 2000). In these cases, rate differences alone cannot be used to postulate a nuclear location. Likewise, a mitochondrial gene experiencing an accelerated rate of substitution, but only little editing, may have a codon bias comparable to nuclear genes. Thus, the only conclusive way to determine the location of a sequence may be through direct observation (e.g., in situ PCR), which for shorter, single copy sequences in the nucleus is not a simple task. Indirect, but less conclusive, evidence may be achieved through Southern and/or northern hybridization (e.g., Adams et al. 1998, 1999, 2001).

Horizontal Gene Transfer
In three recent papers it has been suggested that mitochondrial sequences may be horizontally transferred between distantly related plants: Bergthorsson et al. (2003) describe five potential cases of transfer between angiosperms involving the genes atp1, rps2, and rps11; Davis and Wurdack (2004) argue that the endophytic parasites in Rafflesiaceae have acquired a mitochondrial nad1 sequence from their host Tetrastigma (Vitaceae); and Won and Renner (2003) describe a potential transfer of a nad1 sequence from a euasterid to Asiatic species of Gnetum. A plausible mechanism facilitating incorporation of foreign mitochondrial DNA into the genome of the recipient has yet to be described.
Horizontal gene transfer has in all cases been postulated on the basis of unexpected positions of sequences on mitochondrial gene trees. However, horizontal transfer, like lineage sorting, introgression, etc., is only one of several [...]
[...] (2006) is based on sequences from seven genes from three genomic compartments: two nuclear genes (18S and partial 26S rDNA), four plastid genes (rbcL, matK, atpB, ndhF), and one mitochondrial gene (atp1). In that analysis, mitochondrial sequences provide approximately 8% of the total number of informative characters. The present analysis is based on the same sets of data except that mitochondrial cob sequences are included and nuclear 26S rDNA sequences are excluded. This raises the proportion of phylogenetically informative characters from mitochondrial genes to approximately 14%, but leaves the phylogenetic pattern only slightly changed except for the positions of Arachnitis (Corsiaceae; see below) and Trithuria (Hydatellaceae), which in our analysis is the sister to Thurniaceae, but in Chase et al. (2006) is included in Burmanniaceae. The trees derived from analysis of the seven genes used in the present study are not shown here in detail, but see Chase et al. (2006).
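How a share of informative characters per genomic compartment might be tallied can be sketched as follows. The sketch assumes the usual parsimony criterion (a site is informative when at least two states each occur in at least two taxa) and uses invented toy alignments; it does not reproduce the 8% or 14% figures given above.

```python
from collections import Counter


def is_parsimony_informative(column):
    """True if at least two nucleotide states each occur in two or more taxa."""
    counts = Counter(base for base in column if base in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative_sites(alignment):
    """Count informative columns in a list of aligned, equal-length sequences."""
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


def informative_share(partitions):
    """Map each partition name to its share of all informative sites."""
    counts = {name: count_informative_sites(aln) for name, aln in partitions.items()}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no informative sites in any partition")
    return {name: n / total for name, n in counts.items()}


# Invented four-taxon alignments standing in for two genomic compartments.
partitions = {
    "mitochondrial": ["ACGT", "ACGA", "AGGA", "AGGT"],
    "plastid": ["AAC", "AAC", "ATC", "ATG"],
}
shares = informative_share(partitions)
print(shares)
```

Gaps and ambiguity codes are simply ignored here, which is one of several possible conventions.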

Data Incongruence
Data from the mitochondrial genome are incongruent with data from the plastid genome. So far, we have only explored congruence by running analyses of different data partitions, but our observations are in agreement with the incongruence, as measured by the ILD (incongruence length difference; Farris et al. 1995) test, between rbcL and atp1 data reported by Davis et al. (2004). Our separate analyses of the plastid sequences and the mitochondrial sequences result in different positions of several major groups of monocots (Fig. 2). Analysis of the nuclear 18S rDNA data alone results in a largely unresolved tree (not shown), but data from many taxa are missing and at this stage comparisons to the organellar gene trees would be premature. With respect to the major groups, the plastid data recover the same tree structure (Fig. 2A) as the combined analysis (Chase et al. 2006). However, the tree based on mitochondrial sequences (Fig. 2B) shows a number of more or less controversial groupings: Acoraceae, plus the alismatids, as sister to Poales; Dioscoreales (excluding Burmanniaceae and Nartheciaceae) as sister to Orchidaceae, which are not part of Asparagales; and Liliales as sister to Asparagales (excluding Orchidaceae). Preliminary results suggest that the Dioscoreales/Orchidaceae and Liliales/Asparagales relationships are not robust with the addition of more taxa.
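For orientation, the ILD statistic is the length of the most parsimonious tree for the combined data minus the summed lengths of the most parsimonious trees for the separate partitions; significance is judged against the same statistic computed for random partitions of equal size. The sketch below takes tree lengths as given rather than performing tree searches. All numbers are invented, and the p-value convention shown is one common permutation convention, not necessarily the one used in the studies cited.

```python
def ild_statistic(combined_length, partition_lengths):
    """Incongruence length difference: extra steps required by combining partitions."""
    return combined_length - sum(partition_lengths)


def permutation_p_value(observed, null_values):
    """Share of replicates at least as extreme as the observed value.

    The observed value is counted as one replicate (an assumed convention).
    """
    at_least_as_large = sum(v >= observed for v in null_values)
    return (at_least_as_large + 1) / (len(null_values) + 1)


# Hypothetical tree lengths (steps) for a combined matrix and its two partitions.
observed = ild_statistic(1250, [700, 520])

# Hypothetical ILD values from nine random partitions of the same sizes.
null_values = [5, 12, 31, 8, 3, 40, 10, 7, 9]
p_value = permutation_p_value(observed, null_values)

print(observed)  # 30
print(p_value)   # 0.3
```

In practice many more random partitions would be drawn, and each replicate requires its own parsimony searches in a phylogenetics package.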

Acorales and the Alismatids
When changes in each of the mitochondrial genes are mapped onto the phylogenetic trees derived from the combined analysis of all seven genes, significant branch-length differences are observed (Fig. 3A, B). This is not the case for the plastid gene sequences (not shown). Both atp1 and cob show increased branch lengths for Acorales, the alismatids, and some taxa in Poales. It might be tempting to explain the unexpected sister-group relationship indicated by the mitochondrial data between these taxa as long-branch attraction. However, preliminary analyses, excluding either Acorales plus alismatids or Poales (entirely or partly), do not show any effect on the position of the remaining taxa (see Siddall and Whiting 1999). Horizontal gene transfer would offer an alternative ad hoc explanation.
Fig. 3A and B. Tree obtained from a combined phylogenetic analysis of four plastid data sets (rbcL, atpB, ndhF, matK), two mitochondrial data sets (cob, atp1), and one nuclear data set (18S) for 125 monocot taxa and 16 dicot outgroup taxa. Branch lengths reflect character changes in cob (A) and atp1 (B). Taxon names are only shown for some "long branches" and groups mentioned in the text.
Table 2. Codon position bias in some monocot groups. a Data set including 135 taxa. The codon bias for Asparagales excludes the "Anthericum clade."
A noteworthy correlation exists between the controversial groupings of taxa cited above and those taxa found by Adams et al. (2002) to lack multiple mitochondrial genes (alismatids and Lachnocaulon [Eriocaulaceae]). Although Adams et al. (2002) did find both atp1 and cob present in the mitochondrial genome, it is still entirely possible that the mitochondrial orthologs are fragmentary and that mitochondrial or nuclear paralogs occur as well. Among the alismatids further evidence for the occurrence of processed paralogs exists for the nad1 gene, for which intron loss has been observed in some species (Gugerli et al. 2001; Petersen pers. obs.). Thus, we cannot exclude the possibility that the cob and atp1 sequences that we have sampled from the alismatids and Acoraceae are nuclear paralogs. If so, a shift in codon position bias towards 3rd position changes is to be expected. Comparisons of the codon position bias in Acoraceae and the alismatids with the bias in Araceae and Tofieldiaceae do show a minor shift towards 3rd position changes in the alismatids and in Acoraceae, but not at the level observed in other groups (Table 2). The minor positional bias observed in the cob sequences compared to the atp1 sequences (Table 2) may be caused by the presence of more edited sites in cob than in atp1. In rice, cob has almost four times as many edited sites as atp1 (Notsu et al. 2002), but this difference may not apply to all monocots, just as it does not apply to all dicots (e.g., Giege and Brennicke 1999). Exclusive occurrence of Ts at RNA edited sites would provide another line of evidence for sequences being processed paralogs. However, the position of edited sites in cob and atp1 is not yet known for taxa closely related to Acorales and Alismatales, and extrapolation from distantly related taxa may not be meaningful. Albertazzi et al. (1998) compared edited sites in the cox2 gene among several taxa and found that less than 1!J of the sites that were edited in Triticum and Zea were edited in Acorus. In the entire mitochondrial genomes of the closely related genera Arabidopsis and Brassica, only 83% of the edited sites are shared (Handa 2003). Hence, other types of data are needed to reveal whether the sequences from Acorales and the alismatids are orthologs or paralogs and, in the latter case, where they are located.

Achlorophyllous Taxa
The trees also reveal a clear tendency for the achlorophyllous taxa to be placed on longer branches (Fig. 3). Previous studies have shown a general trend towards accelerated substitution rates in both nuclear and plastid genes in certain achlorophyllous taxa (e.g., Duff and Nickrent 1997; Caddick et al. 2002). Our data indicate that an increased substitution rate applies to all genomic compartments. Our matrix includes five achlorophyllous taxa: Sciaphila (Triuridaceae), Burmannia, Thismia (both Burmanniaceae), Arachnitis, and Petrosavia (Petrosaviaceae). The most significant long branches are possessed by Thismia (changes in cob) and Sciaphila (changes in atp1), whereas Petrosavia is placed on a branch of "normal" length (Fig. 3). Accelerated substitution rates might also be observed if the sampled sequences are nuclear paralogs (see above). In this case a shift in codon-position bias towards 3rd position changes would be expected. However, no such change is observed in Sciaphila and only a moderate change is seen in Thismia, Burmannia, and Arachnitis (data not shown).
In the combined 7-gene tree, Thismia, Burmannia, and Arachnitis form a monophyletic group corresponding to Burmanniales sensu Dahlgren et al. (1985). Monophyly of Dahlgren's Burmanniales was supported by phylogenetic analysis of morphological characters (Stevenson and Loconte 1995). However, recent phylogenetic analyses of molecular data disputed the inclusion of Corsiaceae in Burmanniales, placing the family in Liliales instead (Neyland 2002; Davis et al. 2004; Chase et al. 2006). Given the potentially accelerated substitution rate in the achlorophyllous taxa, long-branch attraction may be postulated to explain these differences. However, separate analysis of the mitochondrial data does not support Dahlgren's Burmanniales, as all three taxa are placed in separate groups: Arachnitis in Liliales, Thismia in Dioscoreaceae, and Burmannia as the sister to Pandanales and Nartheciaceae (Fig. 2). Because no plastid data exist for Arachnitis, its position on the combined 7-gene tree is most likely influenced by the nuclear 18S rDNA data, which show a significantly increased branch length for Arachnitis and moderately increased branch lengths for Burmannia and Thismia (not shown).
The phylogenetic relationships of the above achlorophyllous taxa are certainly not fully clarified, and apparent minor differences in taxon sampling may strongly influence the outcome of the phylogenetic analyses (e.g., Davis et al. 2004). Accelerated substitution rates and lack of plastid genes (or presence of strongly modified sequences) are factors that may confound phylogenetic reconstruction. A much denser taxon sampling may lead to a more stable phylogenetic hypothesis.

Liliaceae and the "Anthericum Clade"
Separate phylogenetic analyses with dense taxon sampling in Asparagales and Liliales reveal more taxa placed on very long branches when changes in the atp1 gene are mapped on the trees derived from combined analyses of the complete data sets (Fig. 4A, B). In Liliales, atp1 places all species of Liliaceae on long branches (Fig. 4B), not just Lilium, the only representative of the family in the general monocot tree (Fig. 3B). In Asparagales, atp1 places all species of the "Anthericum clade" (in Agavaceae; consisting of Anthericum, Chlorophytum, Echeandia, and Leucocrinum) on long branches (Fig. 4A). These long branches do not occur on the general monocot tree (Fig. 3B) due to the lack of atp1 sequence data for Chlorophytum, the only representative of the Anthericum clade in the analysis.
The long branches could be due to a generally increased substitution rate in the mitochondrial genome, as suggested for some dicots (Adams et al. 1998; Palmer et al. 2000). In that case, mapping of both cob and atp1 would be expected to reveal long branches for the same taxa. This is neither the case for the Anthericum clade nor for Liliaceae, both of which possess nondivergent cob sequences. Alternatively, the long branches suggest that the sequences are from nuclear paralogs. If so, a codon-position bias different from that of mitochondrial sequences would be expected, i.e., an increased 3rd position bias. This is the case for both the Anthericum clade and Liliaceae. In Liliaceae the positional bias is approximately 1:1:10 compared to approximately 1:1:3 for the rest of Liliales, and in the Anthericum clade the positional bias is approximately 3:1:26 compared to approximately 2:1:5 for the rest of Asparagales. Hence, in both groups paralogous, nuclear sequences may have been sampled. In addition, preliminary studies indicate that both genes in Liliaceae (but not in the Anthericum clade) are present in more than one copy. Unfortunately, none of these groups were included in the study by Adams et al. (2002), in which 280 angiosperm genera were screened for presence/absence of genes in the mitochondrial genome.
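A positional bias such as 1:1:10 summarizes how character changes are distributed over the three codon positions. The sketch below is a simplified stand-in: it counts pairwise differences between two aligned, in-frame coding sequences rather than changes mapped on a tree, and the sequences and function names are invented for illustration.

```python
def changes_by_codon_position(seq_a, seq_b):
    """Count differences at 1st, 2nd, and 3rd codon positions.

    Assumes both sequences are aligned, equal in length, and start in frame.
    Positions with a gap or ambiguity code in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a in "-N?" or b in "-N?":
            continue
        if a != b:
            counts[index % 3] += 1
    return tuple(counts)


def as_ratio(counts):
    """Scale counts so that the smallest non-zero count equals 1."""
    positive = [c for c in counts if c > 0]
    if not positive:
        raise ValueError("no changes observed")
    smallest = min(positive)
    return tuple(round(c / smallest, 1) for c in counts)


# Invented in-frame sequences of four codons each.
counts = changes_by_codon_position("ATGGCTAAACGT", "ATAGCCAAGTGT")
print(counts)            # (1, 0, 3)
print(as_ratio(counts))  # (1.0, 0.0, 3.0)
```

A strong excess in the third value, relative to the ratio seen in related taxa, is the pattern that the text associates with a possible nuclear location.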

CONCLUSION
Mitochondrial gene sequences offer an important source of characters for phylogenetic reconstruction. They may also be the only reliable source of organellar phylogenetic evidence in achlorophyllous taxa. However, the present data and previously published evidence (e.g., Adams et al. 2002) highlight problems related to the apparently frequent occurrence of paralogous sequences (processed or not). The hypotheses presented here about gene duplication, transfer to the nucleus, and ultimately about paralogy are at this stage based on indirect evidence. Future studies demonstrating the physical location of the sequences will provide better evaluations of our hypotheses.
The observed incongruence between data from the plastid and the mitochondrial genomes may to some extent be caused by inclusion of paralogous sequences in the mitochondrial data sets. However, at present we can only hypothesize about the reason(s), and the existence of incongruence could equally well refute the phylogenies based on plastid data. In general, gene trees derived from different plastid sequences are largely congruent, and combined analyses of plastid data result in well-supported phylogenetic hypotheses. Adding more genes from the same linkage group may increase the support value of clades, but does not refute or corroborate the hypothesis that the plastid tree reflects the species phylogeny. Future phylogenetic studies in the monocots should instead concentrate on additional mitochondrial sequence data, nuclear sequence data (preferably from low or single copy genes), and on producing strongly needed morphological data.

Fig. 1. Phylogenetic position of paralogous sequences. A directly copied DNA sequence (inserted as indicated by the lower stippled arrow) is initially identical to its progenitor sequence and the two sequences will be placed as sister groups. A processed paralog (inserted as indicated by the upper stippled arrow) including edited sites (*) may theoretically be placed anywhere in the tree and may cause topological changes. In this case the shaded branch collapses.
Fig. 2A and B. Trees summarizing the position of major groups of monocots. A. Tree based on the analysis of four plastid data sets (rbcL, atpB, ndhF, and matK) including 139 taxa. B. Tree based on the analysis of two mitochondrial data sets (cob and atp1) including 139 taxa. a Excludes Araceae and Tofieldiaceae. b Includes Thismia. c Includes Corsiaceae. d Excludes Orchidaceae.
Fig. 4A and B. Trees showing accelerated substitution rates of atp1 in clades within Asparagales and Liliales. A. Tree obtained from phylogenetic analysis of two mitochondrial data sets (cob, atp1) including 135 taxa of Asparagales. Branch lengths reflect character changes in atp1. B. Tree obtained from phylogenetic analysis of two mitochondrial data sets (cob, atp1) and one plastid data set (ndhF) including 43 taxa of Liliales. Branch lengths reflect character changes in atp1.

Table 1. Mitochondrial gene sequences used in angiosperm phylogenetics.