Missing Links Between Disjunct Populations of Androcymbium (Colchicaceae) in Africa Using Chloroplast DNA Noncoding Sequences

With the objective of clarifying some aspects of the biogeography, phylogeny, and taxonomy of the genus Androcymbium, we sequenced three chloroplast DNA noncoding regions (trnL intron, trnL-trnF IGS, and trnY-trnD IGS). These data were analyzed with maximum parsimony and the ancestral areas method of Bremer. Results show that Androcymbium is not monophyletic and that the origin of its distribution and speciation is situated in western South Africa. It later dispersed first to eastern South Africa and then to North Africa. Androcymbium austrocapense and A. roseum allow us to phylogenetically connect the species of western with eastern South Africa, and the southern species with the northern, respectively. The formation of an arid track in Africa at the end of the Miocene explains the colonization of the Mediterranean basin by Androcymbium. Androcymbium wyssianum is a key element in understanding colonization of the Canary Islands. The biogeographical pattern of distribution of Androcymbium fits with that of many other genera with similar disjunct distributions. This indicates the importance of the Miocene arid track in understanding the floristic connections between northern and southern Africa. Because of the close relationships of Bulbocodium, Colchicum, and Merendera with Androcymbium inferred from the chloroplast data, restructuring the taxonomy and nomenclature of the tribe Colchiceae may be required.


INTRODUCTION
Studies of the biogeography of Africa have emphasized that the floristic relationship between arid zones of northern and southern Africa is one of the most intriguing phenomena in plant distribution (De Winter 1971). About 63% of the genera of northern African xerophytic flora, including many monocots, are found in the symmetric austral zone (Monod 1971). To explain this phenomenon, the importance of the role of an arid track (Balinsky 1962), established in the Late Miocene, in the biogeographical history of Africa has often been asserted (Axelrod and Raven 1978). This region may have been a migration corridor from southern to northern Africa (or vice versa) for some groups.
Molecular phylogenetic methods provide great potential for testing this argument and clarifying several aspects of biogeography and evolutionary biology of disjunctions in Africa. We believe that a reasonable understanding of diversification processes within the component taxa of a given flora provides the best basis for generalizations about the diversification of the flora as a whole.
There are recent examples of phylogenetic studies of several genera with similar geographic distributions: Leucas R. Br. (Ryding 1998), Lotononis (DC.) Eckl. & Zeyh. (Linder 1992), and Moraea Mill. (Goldblatt et al. 2002). Some of these authors argued in support of a biogeographical hypothesis of fragmentation of a once pan-African distribution of the taxa, while other authors put forward arguments for long- or short-range dispersals across the arid track. In some studies, these disjunctions were inferred to have been established during the Miocene. Unfortunately, there is a paucity of molecular phylogenetic investigations of important African groups.
Thus, the role of the arid track in the biogeographical history of Africa is still poorly understood.
Androcymbium Willd. (Colchicaceae) consists of 56 hermaphroditic geophytes that exhibit a disjunct distribution between northern and southern Africa, with western South Africa as the center of taxonomic diversity. Previous phylogenetic analyses with morphology, allozymes, and chloroplast DNA restriction fragment length polymorphisms (cpDNA-RFLP) have allowed us to develop evolutionary hypotheses about relationships of these taxa in some disjunct areas of their distribution (North Africa and western South Africa) (Caujape-Castells et al. 1999; Membrives 2000).
Our principal aim in this work is to present one phylogenetic hypothesis, including representative taxa from all areas of the distribution of the genus. For this we have considered samples from the four general areas of distribution (western South Africa, eastern South Africa, south-central Africa, and North Africa) of Androcymbium (Fig. 1). Based on previous studies of morphology and life traits of Androcymbium of south-central Africa and eastern South Africa (never before included in molecular analysis), some of these taxa could be the species that phylogenetically connect the populations of the northern and southern areas of Africa (missing links). Therefore, their inclusion in the phylogenetic tree was considered essential to better understand this disjunction.

Plant Material
We analyzed 75 populations belonging to 28 taxa from the genus Androcymbium (Table 1). Our taxon sampling represents a wide range of variation in Androcymbium and the entire geographic range of the genus across Africa. Six different taxa of the family Colchicaceae, and one taxon from the family Alstroemeriaceae (Brummitt 1992), which is phylogenetically close to Colchicaceae (Bremer 2000; Vinnersten and Bremer 2001; Vinnersten and Reeves 2003), were used as outgroups (Table 1).
All of the analyzed samples for this study were planted and grown under the same conditions in the investigation greenhouse at the Marimurtra Botanic Garden in Blanes, Spain.

DNA Isolation, PCR Amplification, and DNA Sequencing
Genomic DNAs were extracted from fresh leaf tissue, previously dried in silica gel followed by snap freezing in liquid nitrogen, using the CTAB method (Doyle and Doyle 1987) with some modifications (Li et al. 2001). The isolated DNA was resuspended in TE buffer (TRIS-EDTA).

Data Analyses
Sequences of the three noncoding cpDNA regions were aligned using CLUSTAL W vers. 1.4 (Thompson et al. 1994), and were checked and corrected by hand with BioEdit vers. 5.0.9 (Hall 1999). Gaps of 2 base pairs (bp) or fewer were removed. Previous analyses of these cpDNA regions have demonstrated that insertions/deletions (indels) longer than 2 bp are not very prone to parallelism and thus may provide important phylogenetic information, whereas homoplasy in indel distribution is almost completely accounted for by indels of 1 or 2 bp (van Ham et al. 1994; Bayer and Starr 1998). Therefore, the indels of 3 bp and longer were coded as binary character data (Simmons and Ochoterena 2000) using the GapCoder program (Young and Healy 2003).
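The binary coding of long indels can be illustrated with a minimal Python sketch. It is not the GapCoder implementation; it assumes a simplified reading of simple indel coding in which each gap with a distinct start and end position becomes one character, taxa with exactly that gap score 1, taxa with a longer gap spanning it score "?", and all other taxa score 0. The function names, the `min_len` parameter, and the toy alignment are hypothetical.

```python
import re

def gap_runs(seq):
    """Return half-open (start, end) intervals for each run of '-' in an aligned sequence."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", seq)]

def code_indels(alignment, min_len=3):
    """Code gaps of at least min_len columns as binary characters (simplified sketch)."""
    runs = {name: set(gap_runs(seq)) for name, seq in alignment.items()}
    # One character per distinct gap (unique start and end) that is long enough.
    indels = sorted({r for rs in runs.values() for r in rs if r[1] - r[0] >= min_len})
    matrix = {}
    for name, rs in runs.items():
        row = []
        for (start, end) in indels:
            if (start, end) in rs:
                row.append("1")   # this taxon has exactly this gap
            elif any(a <= start and end <= b for a, b in rs):
                row.append("?")   # a longer gap spans it: scored as inapplicable
            else:
                row.append("0")   # gap absent
        matrix[name] = "".join(row)
    return indels, matrix

# Toy alignment (hypothetical data, not from this study)
toy = {
    "taxon1": "ACGT---ACGT",
    "taxon2": "ACGTTTTACGT",
    "taxon3": "AC-------GT",
    "taxon4": "A-GTTTTACGT",  # 1 bp gap, ignored because it is shorter than min_len
}
indels, matrix = code_indels(toy)
```

In this toy case two indel characters are recovered, and the 1 bp gap in taxon4 contributes no character, mirroring the decision to discard gaps of 2 bp or fewer.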
The ILD test (Farris et al. 1995), implemented in PAUP* vers. 4.0b10 (Swofford 2002) as the partition homogeneity test, was carried out to test the combinability of the three data sets.
We analyzed the phylogenetic relationships with maximum parsimony (MP) in PAUP*. The analyses were carried out with the heuristic search strategy with tree bisection reconnection (TBR), saving all shortest trees at each step (MULPARS), and branch swapping on all trees saved (STEEPEST descent). Multiple islands of equally most parsimonious trees were searched for using the heuristic option with 100 random sequence additions. The consistency index (CI) and the retention index (RI) are presented to estimate the amount of homoplasy in the characters. The relative support for each clade was assessed by bootstrap analysis (Felsenstein 1985) with 1000 pseudoreplicates of the data and TBR branch swapping. In each bootstrap replicate, we limited the maximum number of trees to 5000.
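The resampling step behind the bootstrap can be sketched in Python as follows. This only illustrates how pseudoreplicate matrices are drawn (columns sampled with replacement); it does not perform the parsimony searches, which were run in PAUP*. The function names and the toy alignment are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Draw one pseudoreplicate by sampling alignment columns with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][c] for c in cols) for name in names}

def bootstrap_replicates(alignment, n_reps=1000, seed=0):
    """Return n_reps pseudoreplicate alignments, each as long as the original."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_reps)]

# Toy alignment (hypothetical data, not from this study)
toy = {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "TCGAAC"}
replicates = bootstrap_replicates(toy, n_reps=1000)
```

Each pseudoreplicate would then be analyzed with the same search settings, and the bootstrap value of a clade is the percentage of replicates in which that clade is recovered.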
The ancestral area analysis of Bremer (1992) was performed to study the geographic origin of Androcymbium.

RESULTS
The amplification of noncoding cpDNA sequences using universal primers has been shown to be successful for phylogenetic reconstructions at low taxonomic levels (Taberlet et al. 1991; Demesure et al. 1995). Phylogenetic studies based on noncoding cpDNA sequences have been successful at both the interspecific (Gielly and Taberlet 1994; Bruneau 1996; Asmussen and Liston 1998) and intraspecific levels (Dumolin-Lapegue et al. 1997; Petit et al. 1997). For this reason, the sequencing of noncoding cpDNA regions was chosen to create the phylogeny of the genus Androcymbium.

Sequence Data
Sequences were obtained from three cpDNA noncoding regions: trnL intron, trnL-trnF IGS, and trnY-trnD IGS. The average lengths of the combined cpDNA regions vary between 1267 bp (northern African species) and 1212 bp (western South African species) (Table 2). Because of this, it was necessary to insert gaps to align sequences, increasing the total length of the aligned matrix (Table 3). These gaps can provide phylogenetic information. Some authors ignore these zones, losing much phylogenetic information when analyzing the data. For this reason, the gaps were coded as character data (Simmons and Ochoterena 2000) and then introduced into the analysis, resulting in a final 1690 bp matrix.
The length of sequences is correlated with geography: the western South African species possess the shortest, eastern South African taxa intermediate, and North African species the longest trnL intron sequences (Table 2). Androcymbium austrocapense populations of western and eastern South Africa have the same nucleotide substitutions and indel pattern as the rest of the species from eastern South Africa. Androcymbium roseum subsp. roseum occurs in south-central Africa, but its sequences have similar length and the same nucleotide substitutions and indel pattern as those of the species of North Africa.
The chloroplast region that possesses the largest percentage of parsimony-informative sites is the trnL-trnF IGS (7.6%). If the gaps are coded as character data and added to the parsimony-informative characters, it is found that the most phylogenetically informative region is the trnL intron. The least informative region is the trnY-trnD IGS (Table 3).
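For readers unfamiliar with the term, the percentage of parsimony-informative sites can be computed as in the following Python sketch. It uses the standard definition (a site is informative when at least two character states each occur in at least two taxa). The function names, the set of ignored symbols, and the toy alignment are assumptions for illustration, not the procedure used to build Table 3.

```python
from collections import Counter

def is_parsimony_informative(column, ignore="-?N"):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(c for c in column if c not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def percent_informative(alignment):
    """Percentage of alignment columns that are parsimony-informative."""
    cols = list(zip(*alignment.values()))
    informative = sum(is_parsimony_informative(col) for col in cols)
    return 100.0 * informative / len(cols)

# Toy alignment (hypothetical data): only the second column is informative
toy = {"taxon1": "AACG", "taxon2": "AATG", "taxon3": "ACTG", "taxon4": "ACTA"}
pct = percent_informative(toy)
```

Coded indel characters can be appended to the matrix as extra columns before counting, which is how adding gap characters can change the ranking of regions.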
The trnY-trnD IGS region, never used before in phylogenetic studies, has a very unstable 101 bp zone. It is present or absent in different populations of different taxa of Androcymbium without any evident biogeographic or phylogenetic pattern. Hence, this unstable zone was removed from the analysis. In the outgroup taxa, this unstable region is always present.
In some cases, we found different DNA sequences within the same Androcymbium species. Each different DNA sequence of the same species was identified as a haplotype.

Incongruence Length Difference Test
The Incongruence Length Difference (ILD) test (Farris et al. 1994) was performed to test for conflicting signal among the three DNA data sets. The result was significant (P = 0.01), indicating that there is evolutionary heterogeneity among the three data sets. If we test only the trnL intron and trnL-trnF IGS, the result is not significant (P = 0.45), indicating that significant incongruence cannot be detected between these two regions.
It has been pointed out that rejection of the null hypothesis of the ILD test may not be due to incongruence caused by different histories (Dolphin et al. 2000). Wiens (1998) recommends analyzing the data sets separately and making one tree with each data set. If there is no incongruence between the groups found in the trees analyzed separately and the groups found in a tree made using the combined set, the data should be combined. We did not find incongruence between the tree topologies obtained with the separate data and with the combined data. Therefore, we decided to combine the three data sets. Moreover, all three regions are linked and part of a nonrecombining chloroplast genome, providing ample justification for combining data sets.

Phylogenetic Analyses
The MP analyses with the data for the three regions combined produced a bootstrap strict consensus tree (Fig. 2). The phylogenetic tree is the result of 1000 resamplings where we limited the number of trees in each replicate to 5000. The tree length was 645 steps, and the consistency index (CI) and retention index (RI) were CI = 0.767 and RI = 0.737. No different islands were found in the phylogenetic analysis.
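As a reminder of how these indices are defined, the ensemble CI and RI can be written as a short Python sketch. It uses the standard formulas CI = M / S and RI = (G - S) / (G - M), where M, S, and G are the minimum possible, observed, and maximum possible numbers of steps summed over characters. The function name and the per-character values are hypothetical and are not the values behind Fig. 2.

```python
def ensemble_indices(min_steps, obs_steps, max_steps):
    """Return (CI, RI) from per-character minimum, observed, and maximum step counts."""
    m, s, g = sum(min_steps), sum(obs_steps), sum(max_steps)
    ci = m / s                # 1.0 means no homoplasy
    ri = (g - s) / (g - m)    # share of possible synapomorphy retained on the tree
    return ci, ri

# Hypothetical per-character step counts for three characters
ci, ri = ensemble_indices(min_steps=[1, 1, 2], obs_steps=[1, 2, 4], max_steps=[3, 4, 5])
```

Values of both indices closer to 1 indicate less homoplasy in the data set.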
Fig. 2 shows that Androcymbium is not monophyletic, because Bulbocodium L., Colchicum L., and Merendera Ramond are nested within Androcymbium. These four genera, which form the tribe Colchiceae, are morphologically characterized by having subterranean, tunicate, bulb-like corms, flowers situated on a very short central stem, and long-clawed tepals. All have the alkaloid colchicine (Dahlgren et al. 1985).
In the phylogenetic tree (Fig. 2) Androcymbium roseum subsp. roseum is found in Clade 1, formed by northern African taxa. This species is widely distributed from the Orange River, in western South Africa, to Tanzania (south-central Africa), following the river courses. Given this distribution it is possible to connect the two disjunct regions on the African continent. This is also consistent with similarities in micro- and macromorphological characters of the North African species and A. roseum (Martin et al. 1993; Pedrola-Monfort 1993; Membrives 2000). The monophyly of taxa in Clade 1 also provides a phylogeographic connection among species and populations of physically separated regions: the Atlantic coast of Morocco (A. wyssianum Hap. 2) with the Canary Islands (A. psammophilum and A. hierrense).
The Clade 2 species (BS 98%) of North Africa and eastern South Africa also are characterized by a series of synapomorphies (Fig. 3; Table 4). These are different from western South African species, with the exception of A. austrocapense. The eastern and western populations of A. austrocapense share identical nucleotide and indel patterns with the eastern South African species of Clade 2.

Areas of Bremer
Results from Bremer's ancestral area method (1992) show that the highest gain-to-loss ratios (G/L), and their rescaled quotients (AA), are for the western South African region. This is followed by eastern South Africa and North Africa (Table 5). They are more easily compared by rescaling the G/L quotients to a maximum value of 1. Rescaled quotients (AA; for estimating ancestral area) are obtained by dividing each G/L value by the maximum G/L found for each cladogram.
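The rescaling step described above can be sketched in Python as follows. The sketch covers only the arithmetic on the quotients; counting gains and losses on the area cladogram is not shown. The function name and the gain and loss counts are hypothetical and are not the values in Table 5.

```python
def ancestral_area_quotients(gains, losses):
    """Return (G/L, AA) per area, where AA is G/L rescaled to a maximum of 1."""
    gl = {area: gains[area] / losses[area] for area in gains}
    top = max(gl.values())
    aa = {area: q / top for area, q in gl.items()}
    return gl, aa

# Hypothetical gain and loss counts for three areas
gains = {"western South Africa": 4, "eastern South Africa": 2, "North Africa": 1}
losses = {"western South Africa": 2, "eastern South Africa": 4, "North Africa": 4}
gl, aa = ancestral_area_quotients(gains, losses)
```

The area with AA = 1 is the one with the highest probability of having been part of the ancestral area, and the remaining areas are ranked relative to it.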

DISCUSSION
The most striking aspect of our results is that A. austrocapense and A. roseum phylogenetically connect the disjunct areas of Androcymbium in Africa. The topological position of these taxa in the phylogenetic tree (Fig. 2) matches the geographical distribution, following the west-east and south-north axes.

Origin of Androcymbium
The ancestral area analysis, following Bremer (Table 5), provides support for western South Africa as the most probable region of origin for the genus. Other morphological (Membrives et al. 2003a, b, c), palynological (Martin et al. 1993; Membrives et al. 2002b . 1999, 2001), support this hypothesis.

West-East South African Disjunction
In the ancestral zone of western South Africa, we find several lineages (Fig. 2), while all the species that occur in eastern South Africa form a clade, along with the North African species (BS 98%). Within the clade that contains all the eastern South African species, we find the coastal species A. austrocapense, which inhabits both regions of the west-east disjunction. A recent study of polymorphism based on RAPDs (del Hoyo in prep.) indicates that the western populations of A. austrocapense have more molecular polymorphism. This can be used to infer a higher probability that western populations are older than the eastern ones. The ancestral area analysis also suggests that the eastern South African region is more modern than the western, but older than the rest of the regions of this disjunction. If we look at the species diversity of these regions, we find that of 56 Androcymbium species, 36 are located in western South Africa, 10 in eastern South Africa, and A. austrocapense occurs in both zones of South Africa. This asymmetry of species distribution is similar to that of many other genera of the African xerophytic flora, e.g., the genus Haemanthus L., with 21 species, 15 of which are found almost exclusively in western South Africa and five in the east. Only the species H. albiflorus Jacq. occurs in both regions (Snijman 1984). This disjunction also occurs in other taxa, such as the genus Erica L., with 621 in western South Africa and 23 in eastern South Africa (Brown and Lomolino 1998), and in many other genera such as Crassula L. Eckl. & Zeyh. (Linder et al. 1992), and Moraea Mill. (Goldblatt et al. 2002). In all of these cases it can be observed that greater morphological diversity occurs in the western species than in the eastern ones.
This biogeographic pattern, with a west to east direction, could serve as an evolutionary model for many other species with the same disjunct distribution, given that this disjunction is very common in many South African xerophytic genera.

South-North Disjunction
Across the geographical south-north disjunction, Androcymbium has 50 taxa in the southern African region (South Africa and south-central Africa), and 6 taxa in North Africa. This pattern is similar to that of many other genera that have high species diversity in South Africa and also have some species in the north, e.g., Erica with 644 taxa in the southern African region, but only 25 in North Africa and Europe (Brown and Lomolino 1998), or Moraea with nearly 200 taxa in South Africa and only one in the Mediterranean basin (Goldblatt et al. 2002). Other examples are Aloe L., Dracaena Vand., Echium L., Lobostemon Lehm., Olea L., and Pelargonium L'Her. These disjunctions could have originated by dispersal or vicariance. The dispersalist hypothesis explains the disjunct patterns of distribution by dispersal following the disappearance of pre-existing barriers; vicariance, in contrast, explains the disjunctions as the result of the appearance of barriers that fragmented the distribution of ancestral taxa. From the sequencing of three cpDNA noncoding regions and Bremer's analysis, we found that the North African species are derived from the South African ones. This suggests that this disjunction originated by dispersal, with western South Africa as the center of origin. The pre-existing barrier was a tropical zone that developed into an arid corridor, the arid track. This corridor connected the south with the north of Africa in the Upper Miocene (Balinsky 1962).
The North African species that form Clade 1 (BS 93%) show a set of synapomorphies at the sequence level that also occur in A. roseum subsp. roseum (Fig. 3; Table 4). This species appears to provide evidence for the connection between South Africa and North Africa and could be the most probable ancestor of this latter species group. Androcymbium roseum is currently widespread in south-central Africa, in zones with arid conditions, occurring specifically in ravines and riverside habitats. Given its apparent inability to establish populations far from sites that experience periodic flooding, and the need for arid conditions for its development, it is possible to suggest that either this species or its ancestor could have arrived in the Mediterranean basin by following the Eonile Canyon (Said 1981, 1993), coinciding with the formation of the Miocene arid track. In previous work with cpDNA RFLPs (Caujape-Castells et al. 1999), an ancestral North African species was dated to 12.1 ± 2.8 million years ago (mya) (Upper Miocene). The relationship between A. roseum and the northern species is also supported by an enormous similarity in plant macromorphology (Membrives 2000), microfeatures of pollen (Martin et al. 1993), and seed coat (Pedrola-Monfort 1993), providing additional indirect evidence of its ancestral nature. Unlike A. roseum subsp. roseum, A. roseum subsp. albiflorum is included in the clade containing all the species of eastern South Africa, in addition to species of the Canary Islands, North Africa, and the Near East (Fig. 3; Table 4). Therefore, the groups made up of the species of North Africa and the species of eastern South Africa (Clade 2) are more closely related to each other than to the species from western South Africa.
Table 5. Ancestral area analysis, following Bremer (1992); a higher gain-to-loss ratio indicates a higher probability of being the ancestral area.
The center of origin of Androcymbium appears to be in western South Africa and the phylogenetic tree (Fig. 2) supports a southwestern to southeastern to northern Africa directionality of dispersal. This biological and geological evidence explains the pattern of distribution via dispersion. Collectively, this supports the Androcymbium dispersalist hypothesis starting from a center of origin situated in western South Africa with distribution to North Africa and the important role of the Miocene arid track and Eonile Canyon.
The disjunctions formed via dispersion may originate in two ways: by a single long-range event, or several progressive, short-range events. If long-range dispersal was a factor in the distribution of Androcymbium before desertification of Africa, then we would expect that some of the species in eastern South Africa must be more recent than their western and northern congeners. In our phylogenetic tree it is observed that the most recent species are the North African ones. Our study seems more consistent with the dispersion hypothesis of Androcymbium by multiple, progressive, and short-range events.

Northern Africa Disjunction
Within northern Africa, we can find another disjunction between the Atlantic coast of Morocco and the Canary Islands. Pedrola-Monfort and Caujape-Castells (1998) proposed three hypotheses that could account for the origin of the Canarian species. The first is that their origins lie in two different mainland taxa. The second possibility is that a single mainland taxon could have colonized both groups of islands at different times. The third alternative assumes the existence of a mainland taxon from which one of the Canarian taxa originated (probably A. hierrense, given the geological history of these islands), which in turn would have been the ancestor of the other one. Caujape-Castells et al. (2001) indicated that the origin of the Canary Islands species A. psammophilum and A. hierrense could be explained by a single colonization event from an ancestor related to the mainland A. wyssianum, agreeing with the third hypothesis. In our phylogenetic tree, the Canarian species form a clade with A. wyssianum Hap. 2 (Essaouira population, Morocco) (BS 86%). This population of A. wyssianum possesses a set of changes at the DNA sequence level that otherwise occur only in the Canarian species, A. psammophilum and A. hierrense. Crossability tests among individuals of the Canarian species and A. wyssianum Hap. 2 indicate that there is reproductive incompatibility, discounting the likelihood of introgression.
The hypothesis supported by all these data is that the Canarian species originated from an ancestor related to A. wyssianum Hap. 2, the population of Essaouira (Morocco). The inclusion of this new population in our analysis has become an important key to the understanding of the relationship between the Canarian and mainland species, and agrees with the hypothesis proposed by Caujape-Castells et al. (2001).

Taxonomic Implications
Because Bulbocodium, Colchicum, and Merendera appear in the ingroup with Androcymbium, we discard a monophyletic origin of the genus. Nevertheless, it is obvious that tribe Colchiceae (sensu Dahlgren et al. 1985) is monophyletic, and making Androcymbium monophyletic requires only four more steps. We propose the reunification of these four genera. Following the International Code of Botanical Nomenclature (Greuter et al. 1994; sect. 3, art. 11.3), the correct name is the earliest legitimate name, in this case Colchicum.