Aliso: A Journal of Systematic and Floristic Botany

Parsimony analyses of DNA sequences from the plastid genes atpB and rbcL were completed for 173 species of Orchidaceae (representing 150 different genera) and nine genera from outgroup families in Asparagales. The atpB tree topology is similar to the rbcL tree, although the atpB data contain less homoplasy and provide greater jackknife support than rbcL alone. In combination, the two-gene tree recovers five monophyletic clades corresponding to subfamilies within Orchidaceae, and fully resolves them with moderate to high jackknife support as follows: Epidendroideae are sister to Orchidoideae, followed by Cypripedioideae, then Vanilloideae, and with Apostasioideae sister to the entire family. Although this two-gene hypothesis of orchid phylogeny is an improvement over all single-gene studies published to date, there is still no consensus as to how all the tribes of Epidendroideae are related to one another. Nevertheless, these new topologies help to clarify some of the anomalous results recovered when rbcL was previously analyzed alone, and demonstrate the value of continued plastid gene sequencing within Orchidaceae.


INTRODUCTION
Orchidaceae are distributed throughout the world, and are by far the largest family of monocotyledons, with more than 750 genera and 30,000 species currently recognized. This immense diversity of taxa coupled with the complexity of orchid flowers presents a great challenge to plant systematists concerned with phylogeny reconstruction and classification (Atwood 1986). For this reason, molecular sequence data have been a boon to orchidologists, since they allow large amounts of data to be generated for many taxa with relatively little cost, time, and difficulty.
The first published molecular phylogeny for Orchidaceae employed plastid rbcL sequences from 33 orchids plus 62 other lilioid monocots (Chase et al. 1994). Although limited in taxon sampling, that study showed that the neottioid orchids are polyphyletic, and thus implied that the orchids might be divided best into five subfamilial lineages: the apostasioid and cypripedioid orchids (each sometimes treated as distinct families), orchidoids (including the diurid and spiranthoid orchids), and epidendroids (including the neottioid and vandoid orchids). Only Vanilla Plum. ex Mill. and Pogonia Andrews were sampled to represent the vanilloid orchids, but this lineage most likely represents a distinct subfamilial clade (Vanilloideae) as well.
The rbcL matrix used by Chase et al. (1994) was expanded substantially by Cameron et al. (1999) to include 158 ingroup and 13 outgroup taxa. Since that study contains the greatest genus and species level sampling to date, it has remained a standard to which most subsequent molecular phylogenetic studies of orchids have been compared (e.g., van den Berg et al. 2005). A handful of additional phylogenetic studies focused on the entire Orchidaceae have been published using genes other than rbcL (e.g., ndhF by Neyland and Urbatsch 1995, 1996; 18S by Cameron and Chase 2000; matK by Freudenstein et al. 2004), but only the studies of the mitochondrial nad1 b-c intron published by Freudenstein et al. (2000) and Freudenstein and Chase (2001) have considered enough taxa (ca. 100) to adequately depict higher level orchid relationships. Unfortunately, the level of sequence variation exhibited in the mitochondrial nad1 b-c intron was also insufficient for addressing relationships much below the rank of subfamily or tribe.
Since no clear consensus has been reached regarding relationships among the subfamilies, tribes, or subtribes of Orchidaceae, a return to the plastid genome as a source of new phylogenetic information is warranted given the well-known advantages of working with chloroplast DNA. These have been discussed previously by Palmer et al. (1988), Clegg and Zurawski (1992), and Olmstead and Palmer (1994), among others, and include unilateral inheritance, numerous copies per cell, ease of amplification and sequencing, absence in animals and fungi, etc. Although there are a number of plastid loci that could have been chosen, atpB was sequenced to complement the previously published rbcL data of Cameron et al. (1999) since its utility for family level systematics studies has been documented for a number of other plant groups (Savolainen et al. 2000). This gene is located downstream from rbcL in the large single copy region, and encodes one of the subunits of ATP synthase, an enzyme that couples proton translocation across membranes to the synthesis of ATP. Furthermore, a coding region was desired for sequencing so that issues of alignment (i.e., character homology assessment) would be minimized for this diverse assemblage of taxa.

Taxon Sampling and Gene Sequencing
Nearly complete DNA sequences for atpB (ca. 1500 base pairs [bp]) were obtained from 173 species of Orchidaceae (representing 150 different genera) and nine genera from Asteliaceae, Blandfordiaceae, Boryaceae, Hypoxidaceae, and Lanariaceae to serve as outgroup taxa. In order to fill in taxonomic gaps and to compare these data with the published rbcL phylogeny of Orchidaceae (Cameron et al. 1999), an additional 30 new sequences of rbcL were also completed; the remaining rbcL sequences were downloaded from GenBank as indicated in Table 1. This resulted in completely congruent matrices for all 182 taxa at the generic level. An effort was made to sequence atpB from the same species as previously done for rbcL (in many cases the same DNA aliquot was used), but this was not always possible (see Table 1). The entire data matrix is available from the author upon request or can be downloaded from The New York Botanical Garden website at http://www.nybg.org/bsci/res/cullb/dna.html (May 2005).
All of the newly generated sequences were produced by automated methods, briefly described as follows. Most of the total DNA was extracted using the FastPrep® (Qbiogene, Inc., Carlsbad, California, USA) and glassmilk method from approximately 0.5 cm² of dried leaf tissue, as described by Struwe et al. (1998). In some cases, DNA aliquots were obtained from the Royal Botanic Gardens, Kew DNA bank (see Table 1). Target loci were amplified in 50 µL volumes using standard polymerase chain reaction (PCR) protocols that typically included the addition of bovine serum albumin (BSA) and betaine, which may or may not have been necessary. The annealing temperature used to amplify both genes was 55°C. For atpB, primers originally published by Hoot et al. (1995), herein designated "ny172": TATGAGAATCAATCCTACTACTTCT and "ny173": TCAGTACACAAAGATTTAAGGTCAT, were used for amplification. Both of these together with "ny174": AACGTACTCGTGAAGGAAATGATCT and "ny175": TAACATCTCGGAAATATTCCGCCAT were used for sequencing. For rbcL, primers "ny35": CTTCACAAGCAGCAGCTAGTTC and "ny149": ATGTCACCACAAACAGAAAC were used for amplification together with "ny23": GCGTTGGAGAGATCGTTTCT and "ny28": TCGCATGTACCYGCAGTTGC for sequencing. In all cases, resulting PCR products were purified using QIAquick® spin columns (QIAGEN, Inc., Valencia, California, USA) according to the manufacturer's protocols. Cycle sequencing reactions were performed using a combination of purified PCR template, primer, and BigDye Reaction mix (Applied Biosystems, Inc., Foster City, California, USA) for 20 cycles. These reactions resulted in complete forward and reverse strands of the genes for nearly all sequences. Centri-Sep® sephadex columns (Princeton Separations, Inc., Adelphia, New Jersey, USA) were used according to the manufacturer's instructions to remove excess dye terminators and primer from the cycle sequencing products. These were subsequently dehydrated, resuspended in a mixture of formamide and loading dye, and loaded onto a 5% denaturing polyacrylamide gel. Samples were analyzed on an Applied Biosystems ABI 377XL automated DNA sequencer, and resulting electropherograms were edited using Sequencher vers. 3.0 software (Gene Codes Corporation, Ann Arbor, Michigan, USA).
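The primer list above can be checked mechanically before ordering or reusing the oligonucleotides. The following is a minimal Python sketch, not part of the original protocol. It assumes that the primer names read "ny172", "ny173", and so on, and that any hyphens or spaces printed inside a sequence are typesetting artifacts; the helper names (`clean`, `reverse_complement`, `gc_fraction`) are hypothetical.

```python
# Primer names and sequences as read from the Methods text. The numeric names
# and the removal of hyphens/spaces inside sequences are assumptions about
# typesetting artifacts, not statements made by the author.
PRIMERS = {
    "ny172": "TATGAGAATCAATCCTACTACTTCT",  # atpB, amplification and sequencing
    "ny173": "TCAGTACACAAAGATTTAAGGTCAT",  # atpB, amplification and sequencing
    "ny174": "AACGTACTCGTGAAGGAAATGATCT",  # atpB, sequencing
    "ny175": "TAACATCTCGGAAATATTCCGCCAT",  # atpB, sequencing
    "ny35": "CTTCACAAGCAGCAGCTAGTTC",      # rbcL, amplification
    "ny149": "ATGTCACCACAAACAGAAAC",       # rbcL, amplification
    "ny23": "GCGTTGGAGAGATCGTTTCT",        # rbcL, sequencing
    "ny28": "TCGCATGTACCYGCAGTTGC",        # rbcL, sequencing (Y = C or T)
}

# Complement table covering the bases used here plus the IUPAC codes Y, R, N.
COMPLEMENT = str.maketrans("ACGTYRN", "TGCARYN")


def clean(seq):
    """Upper-case a sequence and drop hyphens, spaces, and other non-letters."""
    return "".join(ch for ch in seq.upper() if ch.isalpha())


def reverse_complement(seq):
    """Return the reverse complement of a cleaned primer sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) only."""
    bases = [ch for ch in clean(seq) if ch in "ACGT"]
    return sum(ch in "GC" for ch in bases) / len(bases)


for name, seq in PRIMERS.items():
    print(name, len(clean(seq)), round(gc_fraction(seq), 2), reverse_complement(seq))
```

The degenerate position in "ny28" is left out of the GC calculation rather than guessed, which is a design choice of this sketch only.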

Phylogenetic Analyses
The individual atpB and rbcL matrices, as well as the combined two-gene matrix, were analyzed using the parsimony criterion in PAUP* vers. 4.0b10 (Swofford 2002) with gaps treated as missing data, characters weighted equally, and with DELTRAN optimization of characters onto resulting trees. The sequenced genera from Asteliaceae, Blandfordiaceae, Boryaceae, Hypoxidaceae, and Lanariaceae were specified as a monophyletic outgroup based on topologies uncovered in broader phylogenetic studies of monocots (Chase et al. 1995). Equally parsimonious trees were found by executing a heuristic search of 1000 random addition replicates using tree bisection and reconnection (TBR) branch swapping, but saving only five trees per replicate in order to discover possible "islands" of maximum parsimony (MP) (Maddison 1991). All trees obtained in the first round of searching were then used as starting trees for a second heuristic search using the same parameters, but this time saving all shortest trees (MULTREES option in effect) until a MAXTREE limit of 10,000 trees was reached. Support values for the relationships discovered by analysis of each matrix were calculated by performing jackknife (jck) analyses of 5000 heuristic search replicates using the TBR branch swapping algorithm and the following settings: 37% deletion, emulate "jac" resampling, one random addition per replicate, holding one tree, and saving two trees per replicate. Finally, a partition homogeneity test (= incongruence length difference [ILD] test) was conducted in PAUP* to test for incongruence between the atpB and rbcL matrices.
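The resampling step behind the jackknife values can be illustrated with a short Python sketch. It is not the PAUP* implementation: it assumes that each character is dropped independently with probability 0.37 (a value close to e^-1), and it covers only the construction of one pseudoreplicate matrix. The tree search on each replicate and the tallying of clade frequencies are omitted, and the toy alignment and function names are hypothetical.

```python
import math
import random

DELETION_PROBABILITY = 0.37  # value given in the text; close to exp(-1)


def jackknife_sample(n_chars, deletion=DELETION_PROBABILITY, rng=None):
    """Return the sorted indices of characters retained in one replicate.

    Assumption: every character is deleted independently with the same
    probability, so the number retained varies between replicates.
    """
    rng = rng or random.Random()
    return [i for i in range(n_chars) if rng.random() >= deletion]


def resample_alignment(alignment, kept):
    """Build the pseudoreplicate matrix from the retained columns only."""
    return {taxon: "".join(seq[i] for i in kept) for taxon, seq in alignment.items()}


# Hypothetical toy alignment (three taxa, eight characters).
toy = {"taxon_a": "ACGTACGT", "taxon_b": "ACGTTCGA", "taxon_c": "ACCTTCGA"}
kept = jackknife_sample(len(next(iter(toy.values()))), rng=random.Random(2005))
replicate = resample_alignment(toy, kept)

print("exp(-1) =", round(math.exp(-1), 3))
print(len(kept), "of 8 characters retained in this replicate")
```

In a full analysis, a clade's jackknife percentage would be the share of replicates whose trees contain that clade.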

Analysis of atpB
The strict consensus of all equally parsimonious trees discovered by independent analysis of the atpB data is presented as Fig. 1-2 with jackknife values >50% indicated. The atpB matrix contains 1504 characters of which 668 (44%) are variable and 459 (31% of total) parsimony informative. Analysis of these data resulted in more than 10,000 trees of maximum parsimony (length of 2499 steps, CI of 0.389, and RI of 0.730). A single atpB tree is presented as a phylogram in Fig. 3 to highlight variation in branch lengths. Overall, the strict consensus of these trees is similar to published phylogenetic reconstructions for Orchidaceae (e.g., Kores et al. 1997; Cameron et al. 1999; Freudenstein and Chase 2001). The five subfamilies recognized by Pridgeon et al. (1999) are monophyletic and each receives high jackknife support (87-100%). Epidendroideae are supported as sister to Orchidoideae s.l. (94% jck), and Cypripedioideae are sister to this pair (84% jck). The positions of Apostasioideae and Vanilloideae among the subfamilies are unresolved. In total, 80 clades receive jackknife support >50% (64 of these ≥75%).
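The distinction between variable and parsimony-informative characters used above can be made concrete with a small Python sketch. It applies the usual definition (a character is informative when at least two states each occur in at least two taxa) and treats gaps, "?", and "N" as missing data, which is an assumption of the sketch. The toy alignment and function names are hypothetical; only the counts 1504, 668, and 459 come from the text.

```python
from collections import Counter

MISSING = set("-?N")  # symbols ignored when classifying a character (assumption)


def classify_site(column):
    """Classify one alignment column as constant, variable, or informative."""
    counts = Counter(ch for ch in column.upper() if ch not in MISSING)
    if len(counts) <= 1:
        return "constant"
    # States present in at least two taxa can group taxa under parsimony.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "variable_uninformative"


def summarize(alignment):
    """Count total, variable, and parsimony-informative characters."""
    columns = ["".join(col) for col in zip(*alignment.values())]
    tally = Counter(classify_site(col) for col in columns)
    variable = tally["informative"] + tally["variable_uninformative"]
    return {
        "characters": len(columns),
        "variable": variable,
        "informative": tally["informative"],
    }


# Hypothetical toy alignment (four taxa, five characters).
toy = {"t1": "AAGA-", "t2": "AAGTC", "t3": "ATCTC", "t4": "AGCAC"}
print(summarize(toy))

# Percentages for the atpB matrix, using the counts reported in the text.
print(round(100 * 668 / 1504), round(100 * 459 / 1504))
```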

Analysis of rbcL
A strict consensus topology similar to that obtained with atpB and with comparable resolution and jackknife support resulted from independent analysis of the rbcL matrix (tree not shown). In this case, the matrix consists of 1330 characters; 605 (46%) of these are variable and 373 (28% of total) are parsimony informative. Again, the analysis yielded more than 10,000 equally parsimonious trees with a length of 2217 steps, CI of 0.368, and RI of 0.687. The trees are based on ca. 10% greater taxon sampling than the rbcL consensus tree for Orchidaceae published by Cameron et al. (1999), but differ only slightly. Although there is no jackknife support for the relationships among the five subfamilies, they are fully resolved as follows: Epidendroideae are sister to Orchidoideae, Vanilloideae are sister to this pair, followed by Cypripedioideae, and with Apostasioideae sister to all the remaining orchids. Overall support within the family (66 clades >50%; only 47 of these ≥75%) is considerably less than found with atpB alone.

Fig. 1. … Chase, Freudenstein, and Cameron (2003) are indicated in boldface where applicable, and jackknife values >50% are given for supported clades. The tree continues in Fig. 2.
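Tree length, CI, and RI are reported for each matrix, and a toy calculation shows how the three quantities relate. The Python sketch below counts steps with the Fitch algorithm for unordered characters on a fixed rooted binary tree, then applies the standard definitions CI = M/S and RI = (G - S)/(G - M), where S is the observed length, M the minimum possible length, and G the maximum possible length. The tree, alignment, and function names are hypothetical, all characters are included (uninformative ones too), and this is not the PAUP* code used in the study.

```python
from collections import Counter


def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a binary tree.

    tree: nested 2-tuples of taxon names; states: dict taxon -> state.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1  # no shared state: one change is required at this node
        return left | right

    visit(tree)
    return steps


def min_steps(column):
    """M: fewest changes possible on any tree (number of states minus one)."""
    return len(set(column)) - 1


def max_steps(column):
    """G: most changes possible on any tree (taxa not in the commonest state)."""
    return len(column) - max(Counter(column).values())


def tree_statistics(tree, alignment):
    """Return (length, CI, RI) summed over all characters of the alignment."""
    taxa = list(alignment)
    m = s = g = 0
    for i in range(len(alignment[taxa[0]])):
        states = {taxon: alignment[taxon][i] for taxon in taxa}
        column = list(states.values())
        m += min_steps(column)
        g += max_steps(column)
        s += fitch_steps(tree, states)
    return s, m / s, (g - s) / (g - m)


# Hypothetical five-taxon tree and three-character alignment.
toy_tree = (("A", "B"), (("C", "D"), "E"))
toy_alignment = {"A": "GAA", "B": "GCA", "C": "TAA", "D": "TCA", "E": "TAA"}
length, ci, ri = tree_statistics(toy_tree, toy_alignment)
print(length, round(ci, 3), round(ri, 3))
```

In the toy data the second character changes twice on the tree although one change is its minimum, so it is the source of the homoplasy that pulls CI below 1.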

Combined Two-Gene Analysis
Using P < 0.01 as a significance threshold for the partition homogeneity test (Cunningham 1997), the rbcL and atpB data sets cannot be considered incongruent (P = 0.02). Moreover, there are no strongly supported clades in conflict between the rbcL and atpB trees. For these reasons, the two matrices were combined to create a matrix of 2834 total characters. Once again, the MAXTREE limit of 10,000 trees was reached, at which point the analysis was aborted. The CI and RI values (0.374 and 0.705, respectively) are comparable and intermediate to those obtained by the individual gene analyses, but overall resolution and the number of clades supported by the jackknife (especially ≥75%) increased considerably when the data were combined. The two-gene tree is based on 1273 variable characters, of which 832 are phylogenetically informative. Table 2 shows a comparison of all data matrix and tree statistics. The atpB + rbcL strict consensus tree with jackknife values is presented as Fig. 4-5 and shows strong support for each of the five subfamilies. Epidendroideae are sister to Orchidoideae, Cypripedioideae are sister to this pair, followed by Vanilloideae, and with Apostasioideae sister to the entire Orchidaceae. These intersubfamilial relationships are supported by the jackknife analysis, but the placements of Cypripedioideae and Vanilloideae are only weakly so (67-68% jck).

[Table 2, surviving values: RI 0.730 (atpB) and 0.705 (combined); clades with jackknife support >50%/≥75%: 80/64 (atpB) and 112/75 (combined).]

DISCUSSION
Chase, Freudenstein, and Cameron (2003) recently proposed a new classification system for Orchidaceae based on a variety of molecular phylogenetic studies focused at the subtribe, tribe, and family level (including the data presented here). No fewer than five subfamilies and 17 tribes were recognized (2 in Vanilloideae, 4 in Orchidoideae, and 11 in Epidendroideae). Four tribes of the latter (Cymbidieae, Epidendreae, Podochileae, and Vandeae) were subdivided further into 26 subtribes. Within Orchidoideae, there were at least 15 subtribes recognized. For the purpose of this discussion, their new classification system will be followed with only slight modifications where noted.
In many ways, the combined atpB + rbcL tree presented here is not greatly different from the rbcL tree published by Cameron et al. (1999). The two studies are difficult to compare, however, in that successive weighting was used to reduce the total number of equally parsimonious trees and increase resolution in the rbcL consensus tree. Nevertheless, it is obvious that the addition of these new atpB data to the rbcL matrix gives a much clearer picture of orchid phylogeny. The atpB matrix is 174 bp longer than the rbcL matrix and provides 86 more phylogenetically informative characters. The atpB data also contain less homoplasy than the rbcL data (CI of 0.324 vs. 0.285 when uninformative characters are excluded), and the tree receives greater jackknife support (80 vs. 66 clades >50%). The possibility exists that part of the inequity between these two matrices is due to the fact that these rbcL sequences were completed nearly 10 years ago by less precise manual methods that were more prone to human error than the automated methods of today.
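The matrix sizes quoted in the text are internally consistent, which a few lines of Python arithmetic can confirm. The sketch assumes that the combined matrix is a simple concatenation of the two genes, so that the counts add; the `MatrixStats` class and `combine` function are hypothetical, and all numbers are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixStats:
    characters: int   # aligned positions in the matrix
    variable: int     # variable characters
    informative: int  # parsimony-informative characters


# Values reported in the Results for the two single-gene matrices.
ATPB = MatrixStats(characters=1504, variable=668, informative=459)
RBCL = MatrixStats(characters=1330, variable=605, informative=373)


def combine(a, b):
    """Statistics of a concatenated matrix, assuming the counts simply add."""
    return MatrixStats(
        a.characters + b.characters,
        a.variable + b.variable,
        a.informative + b.informative,
    )


COMBINED = combine(ATPB, RBCL)
print(COMBINED)
print("length difference:", ATPB.characters - RBCL.characters)
print("extra informative characters:", ATPB.informative - RBCL.informative)
```

The sums reproduce the 2834 total, 1273 variable, and 832 informative characters given for the two-gene matrix, and the differences reproduce the 174 bp and 86 characters cited in the comparison.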

Fig. 5. Subfamilies, tribes, and subtribes sensu Chase, Freudenstein, and Cameron (2003) are indicated in boldface where applicable, and jackknife values >50% are given for supported clades.

atpB Sequence Anomalies
Length differences were encountered for atpB sequences in select orchids. The particular insertions/deletions (indels) were few and not coded separately since most appear to be isolated events of no phylogenetic significance. Species of Vanilla are an exception as they all share a six bp insertion at the 3'-end of the gene. The most dramatic cases of sequence anomaly are seen in Earina Lindl., which has a divergent atpB sequence ca. 12 bp shorter than in other taxa. Near the 5'-end there is a gap of 16 bp, but this is followed downstream by a seven bp insertion, then a nine bp deletion. The gaps resulting from these indels (treated as missing data) may explain the instability of Earina in the resulting trees. The sequence of Spathoglottis Blume is also an extreme in that it is ca. 12 bp longer than other taxa; it contains one insertion of five bp and another of seven bp.
In contrast, full length atpB sequences were amplified from the mycoheterotrophic and mostly achlorophyllous vanilloid genera Cyrtosia Blume, Erythrorchis Blume, and Pseudovanilla Garay. A sequence from the genus Galeola Lour. was not included in this analysis because of missing data for rbcL, but even that taxon yielded an intact atpB sequence that placed it sister to Cyrtosia, as expected based on morphology. Such has not been the case for rbcL or psaB in which pseudogenes have been documented for Cyrtosia (Cameron 2004), and for which Galeola has resisted amplification. Whether or not ATP synthase is functional in these nonphotosynthetic orchids is uncertain. Overall, atpB sequence divergence is predictably low among genera of Epidendroideae, but extremely high among genera of Vanilloideae, as documented for nuclear, mitochondrial (Freudenstein and Chase 2001), and other plastid genes (Cameron et al. 1999; Cameron 2004) as well.

Subfamily Relationships
One significant difference between the rbcL tree and the atpB tree is the position of Cypripedioideae relative to the other subfamilies. In the case of rbcL, this monophyletic subfamily of diandrous orchids is sister to all other orchids except Apostasioideae. Both the atpB tree and the combined tree (Fig. 2) place Cypripedioideae as sister to the Epidendroideae + Orchidoideae clade. Cameron and Chase (2000) also recovered this arrangement with 18S data. Most orchid systematists have considered Cypripedioideae to be only slightly less "primitive" than Apostasioideae on account of their terrestrial habit, two fertile anthers, abscission layer between perianth and ovary, pollen monads, and crustose seeds enclosed within fleshy trilocular fruits in Selenipedium Rchb. f., a supposedly relictual genus. However, Atwood (1984) argued against this view of their "primitiveness" and Dressler (1986) at one time felt very strongly that Cypripedioideae was the sister group to what was then treated as subfamily Neottioideae (i.e., Epipactis Zinn, Listera Adans., and their relatives, which are now considered "lower" Epidendroideae). He cited evidence in the form of shared seed structure, cytology, and habit between these groups. It is worth pointing out (Fig. 1) that Cypripedium L., not Selenipedium, is sister to all other taxa of Cypripedioideae (94% jck) according to the atpB and combined data (its position is unresolved with rbcL alone). Some of the presumably plesiomorphic characters of Selenipedium may in fact be secondary gains. Moreover, members of Vanilloideae exhibit just as many, or even more, plesiomorphic characters as Cypripedioideae, and their single anther is developmentally not homologous with that observed in Epidendroideae/Orchidoideae (Freudenstein et al. 2002). Hence, the reversed positions of Cypripedioideae and Vanilloideae may not be so surprising as they seem at first glance.

Relationships of Problematic Taxa
Within each subfamily there are one or more genera whose position in the orchid phylogeny continues to be unstable. For example, the relationships among the four major lineages of Vanilleae are fully resolved and supported by atpB, but not in the two-gene tree. Moreover, the relationships of Cleistes divaricata and Isotria Raf. to the other genera of Pogonieae in the atpB tree are unlike any other topology seen before (e.g., Cameron and Chase 1999). Sequence divergence among all these vanilloid orchids is very high, and the possibility is real that long-branch attraction may be a factor in this clade. The positions of Chloraea, Codonorchis Lindl., Megastylis glandulosa, and Pterostylis R. Br. of Orchidoideae are ambiguous in these trees as well. These genera exhibit a number of plesiomorphic characters, and are possible descendants of an ancient ancestor(s) distributed across Gondwana. Today they are isolated relicts of disjunct lineages distributed in Chile, New Caledonia, and Australia. The other species of Megastylis Schltr. are firmly embedded within Diurideae, making the genus polyphyletic, but M. glandulosa shows an unexpected affinity with Pachyplectron Schltr. (both endemic to New Caledonia) according to the atpB data (Fig. 4). With rbcL (Cameron et al. 1999) and psaB (Cameron 2004) Megastylis glandulosa is sister to Chloraea, whereas Pachyplectron is allied to Goodyerinae, a more logical arrangement based on column morphology, pollen structure, and other floral features. The atpB tree places Chloraea sister to all other genera of Orchideae, whereas the combined tree relocates the genus to the base of Cranichideae (Fig. 4). Neither position is supported by the jackknife. Likewise, Codonorchis is either sister to all Orchideae or unresolved among the tribal branches of Orchidoideae. Recognition of Codonorchideae, in either case, is probably justified since this monotypic genus with whorled leaves is morphologically unique in the family. Greater character and taxon sampling may help to settle these mobile taxa in future analyses.
It is difficult to make firm conclusions regarding relationships among the major lineages of Epidendroideae since resolution and jackknife support are poor in this group. Nevertheless, a few sister relationships in the subfamily are worth pointing out (Fig. 2). An alliance between Malaxideae and Dendrobieae (both with naked pollinia) continues to hold in these trees, just as it did with rbcL (Cameron et al. 1999). Podochileae may also be closely related to them. Polystachya seems firmly positioned now as sister to Vandeae (72% jck). Cameron et al. (1999) expressed concern for the placement of Polystachya near Laeliinae in their rbcL tree, and felt that it might represent a spurious result of incomplete taxon sampling. The rbcL sequence of Polystachya contains some 30 bp of missing data, and is one of the most divergent in Epidendroideae. It is probably of dubious quality and should be resequenced. Chase, Freudenstein, and Cameron (2003) proposed Collabiinae as a subtribe to encompass several genera typically classified as part of Arethuseae (Fig. 5). The two-gene tree shows that this subtribe is monophyletic and clearly unrelated to the core Arethuseae. It may or may not be part of Epidendreae, but the monophyly of that tribe is not resolved here. More sampling is needed in the form of genera such as Isochilus R. Br. or Ponera Lindl., which may help to bring these groups together, since van den Berg et al. identified them as basal members of the Laeliinae clade. Likewise, it may be wise to sequence Cremastra Lindl. and/or other members of Calypsoeae for future studies, because that tribe continues to be polyphyletic.

Conclusions and Future Directions
The addition of these new atpB characters to the rbcL matrix of Cameron et al. (1999) gives a much clearer picture of phylogenetic relationships within Orchidaceae since the overall two-gene tree's resolution and jackknife support levels are increased substantially over either individual gene tree. Perhaps the greatest value of these new data is in documenting that although they are relatively conserved, collecting additional plastid gene sequences for Orchidaceae is worth the effort. They are easy to sequence and avoid many of the pitfalls encountered with sequencing nuclear, mitochondrial, or more variable plastid regions (especially issues of alignment). Certainly, other plant systematists have found the combination of several plastid genes to be a profitable strategy for improving hypotheses of phylogeny (Graham and Olmstead 2000; Reeves et al. 2001; Sytsma et al. 2002), and so the next step in this program of research will be to combine the rbcL and atpB data with a third plastid gene (psaB) for the same set of taxa. Ultimately, the fundamental issues of orchid origins, speciation, and coevolution with animals, fungi, and other plants can be addressed more objectively when a robust phylogeny for the family is in hand.