Aliso: A Journal of Systematic and Floristic Botany
Phylogeny and Biogeography of the Prayer Plant Family

Marantaceae are the second largest family in the order Zingiberales, with approximately 31 genera and 535 species. Earlier studies based on morphological and molecular characters could not confidently determine the relationships among major lineages of the family, nor could they identify the basal branch of the family tree. Phylogenetic analyses of DNA sequence data from all three genomic compartments (chloroplast: matK, ndhF, rbcL, rps16 intron, and trnL-trnF intergenic spacer; mitochondrion: cox1; nucleus: ITS region and the 5'-end of 26S) for a restricted set of taxa were conducted under parsimony criteria to define the root node and to assess geographical distribution patterns. Our results support the recognition of five major lineages, most of which are restricted to a single geographical region (tropical America, tropical Africa, or tropical Asia). The phylogenies and character reconstructions (Fitch parsimony optimization, Bremer ancestral areas, and DIVA) support an African origin for the family, followed by a minimum of two dispersal events to the New World tropics and four or more dispersal events to the Asian tropics. Less likely are two alternative hypotheses: (1) vicariance of a western Gondwanan group (the Americas and Africa) followed by several dispersals to Asia and Africa, or (2) an American origin followed by several dispersals to Africa and Asia. The low specific diversity in Africa may be due to higher extinction rates as a result of shrinking lowland tropical forests during the Tertiary.


INTRODUCTION
The family Marantaceae includes 31 genera and ca. 535 species distributed throughout warm temperate and tropical regions of the world (Andersson 1998), with 14 genera in the New World, 11 in Africa, and 8 in Southeast Asia. Halopegia K. Schum. is the only transcontinental genus with three representatives in Africa and one in Southeast Asia. Species distributions are significantly skewed with the vast majority of the species (ca. 450) restricted to the New World. The numbers of species listed below generally follow Andersson (1998), but may not reflect the current estimates of diversity for some genera.
New World taxa (14 genera/ca. 450 spp.) include the most species-rich genus of the family, Calathea G. F. W. Meyer, with as many as 300 species. Also found in the New World are Ctenanthe Eichler with ca. . Thalia is restricted to the New World tropics except for a single species (T. geniculata L.) that is naturalized in both Africa (Andersson 1981) and India (Sijimol et al. 2000). The three monotypic genera from this region are Koernickanthe L. Andersson, Myrosma L. f., and Sanblasia L. Andersson. At least one genus (Calathea) has been shown to be polyphyletic and should be divided into two genera (Prince and Kress in press).
The African genera include Afrocalathea K. Schum.  Milne-Redhead (1950, 1952). A few new African species and subspecies of Marantochloa have been described (D'Orey 1981; Dhetchuvi 1996) and phylogenetic relationships within some genera are currently being investigated (A. Ley pers. comm.). As stated above, the genus Halopegia occurs in both Africa (2 spp.) and Asia (1 sp. from Indonesia).
The Asian taxa (8 genera/ca. 47 spp.) are the least understood, but are currently under investigation using a combination of molecular and morphological methodologies (Suksathan and Borchsenius pers. comm.). For tropical Asia, Andersson (1998)  ). The number of species recognized will likely increase and generic circumscriptions of Phacelophrynium, Phrynium, and Schumannianthus will require modification based on the results of recent investigations using molecular data (Prince and Kress in press; P. Suksathan and F. Borchsenius pers. comm.).
The family has been divided into two tribes based on the number of fertile locules: Maranteae with one fertile locule, and Phrynieae with three fertile locules per ovary (Petersen 1889; Loesener 1930a). The results of Andersson (1998), Andersson and Chase (2001), and Prince and Kress (in press) suggest that classifications based on the number of fertile locules do not adequately reflect phylogenetic relationships in the family. Andersson (1998) published an informal classification based on an investigation of a number of morphological, anatomical, and biochemical characters. He recognized five groups plus five genera of uncertain affinity. His Phrynium group was primarily Asian and included Monophrynium, Phacelophrynium, Phrynium, Stachyphrynium, and the African genus Ataenidia. The Donax group was primarily African and included Hypselodelphys, Megaphrynium, Sarcophrynium, Trachyphrynium, and the Asian genera Donax and Schumannianthus. The Myrosma group was entirely American and included the genera Ctenanthe, Hylaeanthe, Myrosma, Saranthe, and Stromanthe. The Calathea group was composed entirely of the American taxa Calathea, Ischnosiphon, Monotagma, Pleiostachya, and Sanblasia. Finally, his Maranta group was a mixture of American (Koernickanthe, Maranta, Monophyllanthe) and African (Afrocalathea, Marantochloa) genera.
The taxa of "uncertain affinity" included representatives from all three continents (Asia: Cominsia and Halopegia; Africa: Halopegia, Haumania, and Thaumatococcus; the Americas: Thalia).
The validity of the earlier formal and informal classifications was tested by Andersson and Chase (2001) using cladistic analyses of plastid rps16 intron data. Their data strongly refute the earlier classifications based on the number of fertile locules and provide support for portions of the Andersson (1998) informal classification. More recently, Prince and Kress (in press) also tested the validity of earlier classification systems using a different set of molecular characters from the plastid genome: the trnK intron and the trnL-trnF intergenic spacer region (IGS). Based on those results, a different informal classification of five major clades was proposed (Table 1).
A few genera (Monophrynium, Monophyllanthe, and Sanblasia) were not included in the analysis and therefore could not be placed in any of the five groups. Additionally, four potentially para- or polyphyletic genera were identified: Calathea, Marantochloa, Phacelophrynium, and Schumannianthus.
Biogeographic studies of Marantaceae are limited. Raven and Axelrod (1974) may have been the first to formally suggest a western Gondwanan origin for the family based on the distribution of members of the tribe Phrynieae in both South America and Africa. Biogeographical patterns were investigated by Andersson and Chase (2001) using results from Bremer's (1992) ancestral area methods. They found a rescaled gains-to-losses index of 1.0 for Africa, 0.5 for Asia, and 0.4 for the neotropics. From this, they concluded the primary center of diversity for Marantaceae is Africa. They also proposed three possible scenarios for the present distribution: (1) migration out of an ancestral area prior to the split-up of tropical continents, (2) long-distance dispersal after the split, or (3) a combination of scenarios 1 and 2. They were unable to discriminate among the various scenarios, as were Prince and Kress (in press), due to the lack of resolution and poor statistical support for the relationships among the five clades.
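The rescaling behind an index such as the one reported above can be illustrated with a short sketch. It assumes that the ancestral area value for each area is the quotient of gains to losses (G/L), divided by the largest quotient so that the top-scoring area receives 1.0, following the description attributed to Bremer (1992). The function name and the gain and loss counts are hypothetical and are not taken from any study.

```python
def rescaled_ancestral_area(gains, losses):
    """Return the gains-to-losses quotient per area, rescaled to a maximum of 1.0."""
    quotients = {}
    for area in gains:
        if losses[area] == 0:
            raise ValueError(f"no losses recorded for {area}; quotient undefined")
        quotients[area] = gains[area] / losses[area]
    top = max(quotients.values())
    return {area: q / top for area, q in quotients.items()}


# Hypothetical counts for illustration only (not values from the literature).
gains = {"Africa": 6, "Asia": 2, "America": 3}
losses = {"Africa": 3, "Asia": 4, "America": 3}
aa = rescaled_ancestral_area(gains, losses)
print(aa)
```

In this toy case the area with the highest quotient is scored 1.0 and the others are expressed as fractions of it, which is how a ranking of candidate ancestral areas would be read.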
The goal of the current study is to utilize additional molecular data from all three genome compartments in order to: (1) better resolve relationships of the major clades, (2) provide better statistical support for the basal branches of the family phylogeny, and (3) infer biogeographic patterns. The genomic regions sampled include coding or structural regions (cox1, 5.8S, 26S, matK, ndhF, and rbcL) that would be easy to align across all taxa, but which might provide too few characters to fully resolve relationships within Marantaceae. Noncoding regions (rps16 intron, trnL-trnF intergenic spacer) were included to provide additional resolution within Marantaceae, but proved difficult to align across the order.

Taxa and DNA Regions Examined
Taxon sampling was predominantly from the living collections, many of which were wild collected, of the Smithsonian Institution Botany Research Greenhouses (Table 2). Sampling includes multiple representatives of each of Andersson's (1998) groups and representatives of all of the taxa of "uncertain affinity," as well as the five clades of Prince and Kress (in press). A total of 25 Marantaceae taxa and one representative for each of the other seven families in Zingiberales were sampled. Sampling within Marantaceae included 26 species representing 19 genera as currently circumscribed. This represents approximately 61% of the genera and 6% of the species in the family. Taxa included in the analysis were chosen as exemplars of each of the five major clades identified by the earlier analysis of the family (Prince and Kress in press) that sampled 27 genera (87%) and 80 species (18%). Earlier studies provided two of the seven data sets (matK and trnL-trnF IGS) used here.
Genomic regions for analysis were selected to span a range of evolutionary rates. The regions sampled included: The mitochondrial cox1 gene of many flowering plants includes a group I intron (Vaughn et al. 1995; Cho et al. 1998; Cho and Palmer 1999; Palmer et al. 2000). The intron encodes a homing endonuclease (Bonitz et al. 1980; Delahodde et al. 1989; Sargueil et al. 1990; Belfort and Roberts 1997) that cuts the gene in a precise position, leaving a small co-conversion track region within the coding portion of the gene as a result of DNA repair mechanisms. The co-conversion track (21 bp) was also excluded from all analyses. Data were also collected for the nad1 intron, rpl16 intron, and the trnE-trnT IGS, but the alignments were so ambiguous that these data sets were excluded from the analysis.

DNA Extraction, Amplification, and Sequencing
Total genomic DNAs were extracted, amplified, and cycle sequenced following methods described in Kress et al. (2002) using Applied Biosystems (Foster City, California, USA) Big-Dye III (1/8 concentration) chemistry Terminator Cycle Sequencing Ready Reaction Kit. Both new and previously published primers for PCR and sequencing were used (Table 3). Amplification for ndhF was done in two parts following the methods of Pires and Sytsma (2002). All new sequences were generated on an ABI 3100 Genetic Analyzer (Applied Biosystems) at Rancho Santa Ana Botanic Garden. Both strands were sequenced with a minimum of 95% overlap unless otherwise indicated (Table 2). DNA fragments were compiled and edited in Sequencher 3.1.1 (Gene Codes Corporation, Ann Arbor, Michigan, USA), aligned manually in Se-Al vers. 2.0a11 (Rambaut 1996), and imported into PAUP* vers. 4.0b10 (Swofford 2002) for analysis. Alignment was relatively unambiguous for coding and other highly constrained regions (nuclear ribosomal genes, cox1, matK, ndhF, and rbcL), but not so for intergenic spacer regions and introns. Alignment for coding and structural regions required the insertion of a number of in-frame gaps. Gaps were also introduced to the IGS and intron matrices. Additionally, ambiguously aligned regions were present in the rps16 matrix (nine regions totaling 366 characters, or ca. 31% of the matrix) and the trnL-trnF IGS matrix (six regions totaling 73 characters, or ca. 14% of the matrix). Analyses in which gaps were coded, ambiguous areas included, or protein-coding sequences translated to amino acid sequence will be presented elsewhere. Ambiguously aligned regions were excluded from all analyses presented here.

Phylogenetic Analyses
The families of the order can be divided into two groups, the "banana families" (Heliconiaceae, Lowiaceae, Musaceae, and Strelitziaceae) and the "ginger families" (Cannaceae, Costaceae, Marantaceae, and Zingiberaceae). Representatives of all families of Zingiberales were included in the analyses, with Musaceae representative Musella (Franch.) C. Y. Wu as the defined outgroup taxon for all "order" analyses based on earlier work by Kress et al. (2001). Additional analyses using only members of the ginger families were also conducted in which Siphonochilus J. M. Wood and M. Franks (Zingiberaceae) and Costus L. (Costaceae) served as the designated outgroup.
Maximum parsimony.-Separate and combined Fitch parsimony analyses (Fitch 1971) of all aligned sequence data (excluding ambiguously aligned regions) were conducted. Parallel analyses were run, one set using Musella as the designated outgroup, another with representatives of the banana families excluded, and Siphonochilus + Costus as the designated outgroup. For each analysis 500 random sequence addition replicates were conducted with tree-bisection-reconnection (TBR) branch swapping, saving all shortest trees.
Estimates of support.-Statistical support for branches was estimated via jackknife (JK: Penny and Hendy 1985) and Bayesian analyses. Jackknife estimates were run on all data sets using the fast JK methods with a large number (10,000) of replicates with 33% replacement. Bayesian analyses were conducted in MrBayes 3.0 (Huelsenbeck and Ronquist 2001) using three replicates of 5 million generations (sampling every 100 generations) for each of the individual data partitions and one combined data set as indicated in the maximum parsimony description above. The first 40 likelihood values (generations 1-4000) were included as the first data partition. Appropriate burn-in times for each analysis were determined by treating each successive half-million generations (= 5000 trees) as a data pool from which 40 likelihood tree values were randomly drawn, creating a total of 11 "generation" samples per Bayesian analysis. Generation sample likelihood values were subjected to a Bartlett's test for homogeneity of variance (Bartlett 1937a, b) to determine whether the data were heteroscedastic (as expected due to the inclusion of the first 40 data points). A Tukey-like multiple comparison test for differences among variances (Levy 1975a, b) was used to determine where the change from heteroscedasticity to homoscedasticity occurred. Only the homoscedastic data from the latter part of each MrBayes run were used to calculate posterior probabilities. In all cases at least the first 500,000 generations were discarded as burn-in.
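The burn-in procedure described above can be sketched as follows. This is a minimal illustration, not the authors' scripts: the function names, the random seed, and the synthetic likelihood trace are assumptions, and only the textbook Bartlett statistic is computed (the Tukey-like multiple comparison step is not shown). Under the null hypothesis of equal variances, the statistic is approximately chi-square distributed with k - 1 degrees of freedom for k groups.

```python
import math
import random
from statistics import variance


def generation_samples(likelihoods, pool_size=5000, draw=40, seed=0):
    """Build comparison groups from one run's sampled likelihood values.

    Group 0 holds the first `draw` values; each later group holds `draw`
    values drawn at random, without replacement, from one successive pool
    of `pool_size` sampled trees.
    """
    rng = random.Random(seed)
    groups = [list(likelihoods[:draw])]
    for start in range(0, len(likelihoods), pool_size):
        pool = likelihoods[start:start + pool_size]
        if len(pool) >= draw:  # skip a short trailing pool
            groups.append(rng.sample(pool, draw))
    return groups


def bartlett_statistic(groups):
    """Bartlett's statistic for homogeneity of variance across groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction


# Synthetic trace (hypothetical): 50,000 sampled values that climb steeply
# at first and then plateau with small noise, mimicking 5 million
# generations sampled every 100.
noise = random.Random(1)
trace = [-5000.0 - 2000.0 * math.exp(-i / 20) + noise.gauss(0, 3) for i in range(50000)]
groups = generation_samples(trace)
statistic = bartlett_statistic(groups)
print(len(groups), statistic)
```

With ten pools of 5000 sampled trees plus the initial 40 values, the sketch yields the 11 "generation" samples mentioned in the text. A large statistic signals heteroscedasticity, which is what the inclusion of the early, rapidly changing values is expected to produce.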

Biogeographical Reconstruction
Studies by Kress and Specht (2006) provide an estimate of at least 95 million years ago (mya) for the divergence of Marantaceae and Cannaceae.

Table 2. Taxa sampled to examine biogeographical patterns in Marantaceae with associated voucher and GenBank numbers. Recoverable column headings: Taxon, Voucher, GenBank accession numbers (matK, ndhF, rbcL, rps16).

Trimmed strict consensus trees from this and earlier studies (Andersson and Chase 2001; Prince and Kress in press) were used to generate a master tree matrix of 37 taxa representing 28 ingroup genera and three outgroup taxa (Cannaceae, Zingiberaceae, and Costaceae). Multiple representatives of the paraphyletic genera Calathea (6 taxa) and Schumannianthus (2 taxa) were included. All other genera of Marantaceae were treated as monophyletic, since prior studies failed to provide strong statistical support (≥70% jackknife or bootstrap) of paraphyly. Source trees were redrawn in MacClade (Maddison and Maddison 2000) then imported into PAUP* for matrix translation. Master matrix construction and analysis were also implemented in PAUP*. This method is an indirect supertree approach (Ponstein 1966; Ragan 1992; Bininda-Emonds et al. 2002). Phylogenetic analysis of the supertree matrix was identical to those described above for maximum parsimony.
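The translation of source trees into a matrix can be sketched as below. The sketch assumes the standard matrix representation with parsimony coding usually associated with Ragan (1992): each clade of each source tree becomes one binary character. The actual translation in this study was done in MacClade and PAUP*, so the function name, the data layout, and the toy trees with taxa A to E are hypothetical.

```python
def mrp_matrix(source_trees):
    """Code source trees as binary characters for a supertree analysis.

    Each source tree is a pair (taxa, clades): the set of taxa it contains
    and a list of its clades, each given as a set of taxa. Every clade
    becomes one character: '1' for taxa inside the clade, '0' for other
    taxa present in that tree, '?' for taxa absent from that tree.
    """
    all_taxa = sorted(set().union(*(taxa for taxa, _ in source_trees)))
    matrix = {taxon: "" for taxon in all_taxa}
    for taxa, clades in source_trees:
        for clade in clades:
            for taxon in all_taxa:
                if taxon in clade:
                    matrix[taxon] += "1"
                elif taxon in taxa:
                    matrix[taxon] += "0"
                else:
                    matrix[taxon] += "?"
    return matrix


# Two hypothetical source trees with overlapping but unequal taxon sets.
tree1 = ({"A", "B", "C", "D"}, [{"A", "B"}, {"A", "B", "C"}])
tree2 = ({"A", "B", "C", "E"}, [{"B", "C"}])
matrix = mrp_matrix([tree1, tree2])
for taxon, row in matrix.items():
    print(taxon, row)
```

The resulting rows can be analyzed like any other character matrix, which is why the same maximum parsimony settings can be reused for the supertree search.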
The strict consensus tree (of 12 shortest trees) from analysis of the supertree matrix was saved and arbitrarily dichotomized (two nodes affected; fully resolved trees required for mapping in MacClade). An alternative topology, with Haumania as part of the Sarcophrynium clade, was also evaluated. A data matrix of a single character (geographic distribution: Africa, America, or Asia) was mapped onto the two trees under either ACCTRAN or DELTRAN optimization using MacClade (FPO analysis). Outgroup taxa (Costaceae and Zingiberaceae) were coded as polymorphic (Africa, America, and Asia). The same fully resolved trees were used in the AA analyses, and in DIVA analyses (one with "max areas = 2" and one with "max areas = 3").
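The first pass of Fitch optimization for a single unordered character such as geographic area can be sketched as follows. The tree shape, the tip names, and the area codings are hypothetical and do not reproduce Fig. 2 or 3. The sketch covers only the downpass that yields state sets and the minimum number of changes; the ACCTRAN and DELTRAN options, which resolve ambiguous assignments in a later pass, are not implemented.

```python
def fitch_downpass(node, tip_states):
    """Return (state_set, steps) for a binary tree written as nested tuples.

    A tip is a string naming a taxon; an internal node is a pair of
    subtrees. Polymorphic tips are handled by giving them more than one
    state in `tip_states`.
    """
    if isinstance(node, str):
        return set(tip_states[node]), 0
    left, right = node
    left_set, left_steps = fitch_downpass(left, tip_states)
    right_set, right_steps = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no change is needed here.
        return shared, left_steps + right_steps
    # Children disagree: keep the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical five-taxon example with a polymorphic outgroup.
tip_states = {
    "outgroup": {"Africa", "America", "Asia"},
    "t1": {"Africa"},
    "t2": {"Africa"},
    "t3": {"America"},
    "t4": {"Asia"},
}
tree = ("outgroup", ("t1", ("t2", ("t3", "t4"))))
root_set, steps = fitch_downpass(tree, tip_states)
print(root_set, steps)
```

In this toy case the polymorphic outgroup does not constrain the root, so the state set at the base is determined by the earliest-diverging ingroup tips, which parallels the reasoning used when the first branches of the family are all African.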

Phylogenetic Analyses of Seven DNA Regions
The large number of maximum parsimony and Bayesian posterior probability (PP) analyses conducted prohibits their full discussion here, but a summary of all results is provided (Tables 4, 5), including tree statistics for various maximum parsimony analyses, JK percentages, and PP values. The plastid data sets provided the majority of the clade resolution (Table 5). This was expected given the conservative regions sampled from the nuclear and mitochondrial genome. Similarly, the mitochondrial and nuclear data sets produced phylogenies with low resolution, low statistical support for internal branches, and some anomalous relationships within the ginger families.
The use of Musella lasiocarpa (Musaceae) or Siphonochilus decorus (Zingiberaceae), plus Costus pulverulentus (Costaceae) as the designated outgroup did not alter ingroup topology. The inclusion of the more distantly related taxa (the banana families representatives: Musella lasiocarpa, Heliconia irrasa, Orchidantha fimbriata, and Phenakospermum guyanense) did not alter relationships within the ginger families noticeably, but instead almost uniformly resulted in lower jackknife and posterior probability values, decreased branch resolution, and increased numbers of tree islands.
Overall, parsimony tree topologies within the ingroup were similar to those produced by Bayesian analyses with minor differences in resolution. As expected, combined analyses produced far fewer trees than individual analyses (2 vs. up to 3570; Table 4) and greatly improved clade support (jackknife and posterior probability values), but did not improve general tree indices (consistency index, retention index, and rescaled consistency index). The largest differences in tree topology were related to the method of data analysis. For example, in the nuclear data analyses the position of Cominsia changed depending on analysis method (parsimony vs. Bayesian) but with low statistical support (59-70% JK). Parsimony placed Cominsia sister to Monotagma with 70% JK support (an unexpected location) while Bayesian analysis placed it in a clade with Donax, Schumannianthus virgatus, and Phrynium (0.87 PP).
The ability of individual data sets to resolve critical clades (with significant statistical support) varied dramatically. The mitochondrial and nuclear data analyses did not resolve any of the expected family relationships (Table 5). This is consistent with the expectation of slow substitution rates in these portions of the plant genome (Wolfe et al. 1987;Palmer and Herbon 1989;Palmer 1990). Similarly, only portions of the five Marantaceae clades were resolved. The various plastid data sets were more successful in resolving within and among family relationships. In all cases the expected family relationships were resolved ((Zingiberaceae, Costaceae)(Marantaceae, Cannaceae)), and ingroup clades were also frequently resolved. Two "wild card" ingroup taxa were identified, Haumania and Thalia. Earlier studies (Prince and Kress in press) suggested Haumania was part of the Calathea clade (<50% JK support, <0.95 PP). This study placed Haumania in unresolved polytomies of a more basal position than the earlier study, or as a member of the Sarcophrynium clade (weakly supported, Table 5). Similarly, Thalia was placed within the Donax clade by Prince and Kress (in press) (80% JK, <0.95 PP), but was sometimes placed in the basal polytomies in this study.
Both combined analyses (Musella as outgroup, or Siphonochilus and Costus as the outgroup) produced the two shortest trees (one tree shown in Fig. 1; branches that collapse in the strict consensus tree are shown with dashed lines). These two trees differed in the relationship among the three major clades: the Calathea clade (CC), the Donax clade (DC), and the Maranta (MC) + Stachyphrynium (STC) clades. Bayesian analysis strongly supported (1.00 PP) the tree shown: ((CC, DC) (MC, STC)), over the alternative: ((CC (MC, STC)) DC). Parsimony favored the alternative topology but with weak (57%) JK support. The first diverging taxon in the Marantaceae tree, based on the combined seven region analyses, was Haumania followed by the Sarcophrynium clade.

Supertree Construction
Phylogenetic analysis of the supertree matrix resulted in twelve shortest trees. The trees were identical in topology with the exception of the resolution of Monotagma and Cominsia as indicated by an upward directed arrow in Fig. 2 and  3, and resolution within Calathea I. The figures shown are a summarized (five representatives of Calathea reduced to Calathea I and Calathea II), fully resolved version of the tree used for FPO, AA, and DIVA analyses of biogeography.

DISCUSSION
All prior estimates of relationships based on morphology and anatomy (Kirchoff 1983a, b; Kress 1990; Kress et al. 2001) predict the following relationship among the ginger families: ((Zingiberaceae, Costaceae) (Marantaceae, Cannaceae)). The relationships among these four families have never been contested, although Costaceae have been treated as a subfamily or tribe of Zingiberaceae in the past (Schumann 1904; Loesener 1930b; Hutchinson 1934, 1959, 1973); thus it can reasonably be considered a known phylogeny. Independent phylogenetic estimates that deviate from this topology may be due to homoplasy, lack of resolution due to limited sequence divergence (low power) or poor taxon sampling, or analysis limitations such as long-branch attraction (e.g., Soltis et al. 2000; Davis et al. 2004). Similarly, overwhelming morphological and anatomical data support the monophyly of each of the four families in the ginger families, including Marantaceae (Kirchoff 1983a, b; Kress 1990; Kress et al. 2001). Finally, earlier molecular studies by Prince and Kress (in press) identified five major lineages within Marantaceae. Each data set was evaluated on its ability to resolve those family-level relationships and major clades as described above.
The addition of almost 5000 nucleotides from all three genome compartments to the earlier plastid data set (Prince and Kress in press) of 2543 characters provides improved resolution for relationships among the major clades of Marantaceae (Fig. 1). Although the additional plastid data sets are the source of most of the characters for tree building, the combined data set provides the best estimate of relationships in the family at this time. The recovery of the same taxa at the base of the family tree using multiple data sets under both Bayesian and parsimony criteria suggests the Haumania + Sarcophrynium clade as the root node or grade.
Additional studies will be needed to determine if Haumania is better included within the Sarcophrynium clade or as the earliest diverging taxon in the family. Both Haumania and all members of the Sarcophrynium clade grow in the African tropics, suggesting an African origin for the family. Results of supertree analyses also identify Haumania as the first branch of the Marantaceae family tree (see Fig. 2, 3), followed by the Sarcophrynium clade. These results are not surprising given the overlap in data sets used to generate the supertree matrix. Biogeographical analyses (FPO, AA, DIVA) differed slightly in their reconstruction of the basal nodes for Marantaceae. DIVA (max areas = 2 or 3), AA, and the FPO-ACCTRAN reconstructions suggest an African origin for the family when Haumania is treated as the first branch of the family phylogeny (Fig. 2; Table 6). An African origin for the family was unexpected since the closest relatives, Canna (Cannaceae), are endemic to the New World (but see Tanaka 2001). This finding is similar to the publication of an African origin for the more distantly related family Zingiberaceae (Kress et al. 2002). Thus, the ginger families may form two pairs of families following parallel biogeographical paths, with the two larger families (Marantaceae and Zingiberaceae) originating in Africa, while their smaller sister families (Cannaceae and Costaceae, respectively) originated in the American tropics.
[Table caption fragment: ... Fig. 2 and 3 under Camin-Sokal (1965) parsimony (= irreversible). Ancestral area value rescaled to a maximum value of 1 (Bremer 1992).]
Inclusion of Haumania in the Sarcophrynium clade results in an American origin for the family only in the FPO analysis (Fig. 3; both ACCTRAN and DELTRAN). There is currently no statistical support for this placement, but the scenario deserves some consideration. Similarly, some DIVA reconstructions (Fig. 2, 3) suggested a more widespread origin for the family, perhaps equivalent to western Gondwana (Africa and America). Kress and Specht (2006) suggest a divergence of Marantaceae and Cannaceae of at least 95 ± 5 mya, based on a local clock and three fossil calibration points within the order Zingiberales. This date is compatible with a western Gondwanan distribution for the ancestor of Cannaceae and Marantaceae; however, Marantaceae do not begin to diversify until ca. 63 ± 5 mya according to the local clock analysis. Although these numbers may change with the inclusion of additional data sets, we may conclude that Marantaceae are probably not a Gondwanan group, i.e., Marantaceae are not distributed pantropically due to vicariant events ca. 110 mya (Kearey and Vine 1996).
The data presented here suggest that current distribution patterns for Marantaceae may be best explained by long-distance dispersal events. The number and direction of events differ depending on reconstruction method. DIVA analyses require 7-8 dispersal events to account for the current distribution patterns in the family. A large number of possibilities exist from stepping-stone progression around the globe, such as Africa to America to Asia followed by several back migrations.
The lack of resolution regarding migration direction is in sharp contrast to the obvious evidence for localized radiations. Four of the five major clades are strongly associated with a particular geographic region. These strong geographic associations suggest an early dispersal event followed by diversification. For example, one interpretation is that Africa is the source of all historic propagules. Ancient dispersal events brought plants to the Americas in two separate events with subsequent speciation for the Calathea clade and the New World representatives of the Maranta clade. Secondary dispersal into Asia occurred at least three times: (1) Donax clade, (2) Stachyphrynium, and (3) Schumannianthus virgatus and Halopegia pro parte, or as a series of dispersal events to Asia and recolonization events back to Africa.
The above scenario suggests extensive extinction events that are difficult to demonstrate. Specifically, Africa is home to far fewer species than Asia or the Americas. The reduced number of taxa in Africa may be due to extensive extinction as a result of severe climate change (eventual cooling and drying) during the Tertiary. Literature documenting the fossil flora of Africa is limited relative to North America, Europe, and northern Asia; however, plant communities can be reconstructed for several regions and times (Greenway 1970; Axelrod and Raven 1978; Ziegler et al. 1981; Ehleringer et al. 1991; Wing and Sues 1991; Retallack 1992; Cerling et al. 1993; Quade and Cerling 1995). Greenway (1970) specifically describes the shrinking lowland rainforest belts in Africa, the primary community where Marantaceae grow. In contrast, tropical habitats in the Pacific Ocean and between North and South America have been expanding due to the creation of island chains via uplift and volcanic activity.
In summary, the geographic history of Marantaceae is complex and uncertain. Early efforts using molecular phylogenies were hampered by poor resolution along the backbone of the family tree. Current data provide improved resolution and better estimates, although statistical support is still lacking for a few critical nodes. The family has likely undergone several dispersal events from Africa to both the New World and to Asia. The lower specific diversity in Africa may be due to extinction events associated with the constriction of tropical forests during the Tertiary. Extensive diversification, especially in the neotropics, may be due to habitat expansion associated with mountain-building processes.