Phylogenetics of the Borage Family: Delimiting Boraginales and Assessing Closest Relatives

The placement of Boraginales and relationships within the family have remained elusive in modern, broad phylogenetic studies. To assess the phylogeny of Boraginales, and specifically to test the sister lineage of the order, a data matrix of the chloroplast markers rbcL, ndhF, and trnL-trnF was assembled from GenBank and de novo sequences (representing 132 new GenBank accessions). Phylogenies inferred using Maximum Likelihood and Bayesian frameworks resulted in identical topologies. Tests for alternative topologies were used to assess whether any of the candidates for sister to Boraginales (Solanales, Gentianales, Lamiales, or Vahlia) could be ruled out with this dataset. Gentianales was eliminated as the possible closest relative to Boraginales. Additionally, SH tests were used to test topological results within Boraginales: monophyly of Hydrophyllaceae cannot be rejected, and paraphyly of Ehretiaceae with respect to the parasitic Lennoaceae is supported. Taxonomic implications are discussed within the context of these phylogenetic results.


INTRODUCTION
Boraginales, as currently recognized by taxonomic experts, includes approximately 2500 species (Mabberley 2008; Weigend et al. 2014). The family has a worldwide distribution, with many species occupying seasonally dry and xeric habitats in both tropical and temperate biomes (Gottschling et al. 2001; Weigend et al. 2014). The name derives from the Latin burra, which references the often hirsute or hispid leaves of plants within the family (Simpson 2010). Inflorescences are generally a monochasial, scorpioid cyme or composed of cymose units (Buys and Hilger 2003), and flowers are bisexual, actinomorphic, and pentamerous. Fruit morphology has historically been important in circumscribing higher taxa such as genera (de Candolle 1846; Cohen 2013), as well as in delimiting lower-level relationships including species boundaries (Hasenstab-Lehman and Simpson 2012). Several taxonomic and systematic questions remain unanswered for this group, and evidence from DNA has both clarified and confounded aspects of relationships. Despite recent attempts to achieve resolution, confident identification of the closest phylogenetic neighbors to Boraginaceae has remained elusive (Albach et al. 2001; Bremer et al. 2002; Moore et al. 2010; Soltis et al. 2011; Weigend et al. 2014; Refulio-Rodriguez and Olmstead 2014).
De Candolle (1846) was the first to treat Boraginaceae as comprising four subfamilies: Boraginoideae, Heliotropioideae, Ehretioideae, and Cordioideae. This circumscription of Boraginaceae ("sensu DC.") remained unchanged for nearly 150 years, with new species described and assigned to each of the subfamilies by specialists in the group (e.g., Johnston 1927). Boraginaceae sensu DC. have been placed in a variety of orders (Table 1) under several classification systems. Takhtajan's (1987) classification differed from those of contemporaries because it recognized the order Boraginales Juss. ex Bercht. & J. Presl, which comprised Boraginaceae sensu DC., Hoplestigmataceae Gilg, Hydrophyllaceae R. Br., Lennoaceae Solms, and Tetrachondraceae Skottsb. ex Wettst. Subsequent molecular evidence indicated that Tetrachondraceae are instead embedded in Lamiales, but the positions of the other families of Boraginales were unresolved (Wagstaff et al. 2000). In the most recent angiosperm-wide classification, the circumscription of Boraginaceae was expanded to include two previously distinct families, Hydrophyllaceae and Lennoaceae (APG III 2009), while taxonomic experts recognize several smaller families within a distinct order (Cohen 2013; Weigend et al. 2013, 2014). The phylogenetic position of Boraginales remains uncertain with respect to Lamiales, Solanales, and Gentianales, but the order has been placed both phylogenetically and taxonomically in Lamiidae (APG III 2009; Refulio-Rodriguez and Olmstead 2014).
© 2017, The Author(s), CC-BY. This open access article is distributed under a Creative Commons Attribution License, which allows unrestricted use, distribution, and reproduction in any medium, provided that the original author(s) and source are credited. Articles can be downloaded at http://scholarship.claremont.edu/aliso/.
APG III (2009) proposed an expanded familial concept of Boraginaceae in part based on molecular evidence clearly demonstrating that Boraginaceae sensu DC. and Hydrophyllaceae are paraphyletic with respect to each other (Olmstead et al. 1993; Ferguson 1999; Gottschling et al. 2001; Moore and Jansen 2006; Nazaire and Hufford 2012; Cohen 2013; Weigend et al. 2013, 2014; Refulio-Rodriguez and Olmstead 2014). Molecular evidence also supports the removal of several genera from de Candolle's Hydrophyllaceae to different orders and families. For example, Hydrolea L. is now placed in Solanales (Soltis et al. 2000) and Pteleocarpa Oliv. is now considered to be in Gentianales (Mabberley 2008). Codon L., a small African genus, is of particular note because it is sister to all taxa of de Candolle's Boraginaceae (Luebert and Wen 2008; Nazaire and Hufford 2012; Weigend et al. 2014). These results have contributed to taxonomic discord. Some authors place Codon in its own family, Codonaceae (Weigend and Hilger 2010; Weigend et al. 2013, 2014), whereas others place it in its own tribe within Boraginaceae (Nazaire and Hufford 2012).
APG III (2009) proposed an expanded Boraginaceae for the additional reason that molecular evidence places the previously distinct Lennoaceae within Boraginaceae sensu DC. Lennoaceae, as treated in the most recent monograph of the group, consists of two genera and four species (Yatskievych et al. 1986). This group of holoparasites has a disjunct distribution, occurring in southwestern North America and Colombia (Yatskievych et al. 1986). Classification of Lennoaceae has been notoriously difficult using morphology or molecular evidence because of the highly autapomorphic morphology of these holoparasites, and sequencing of chloroplast loci often yields what appear to be pseudogenes (pers. obs.).
As shown in Table 1, the group has been placed in Lamiales (Cronquist 1988) and Solanales (Thorne 1992), whereas specialists on parasitic plants instead suspected an affiliation with borages (Yatskievych et al. 1986). Indeed, it is now clear that these parasites are most closely related to recently derived woody borages (Gottschling et al. 2001, 2004, 2005; Moore and Jansen 2006; Nazaire and Hufford 2012; Gottschling et al. 2014). However, their exact phylogenetic position varies in these studies, and the current study seeks to resolve their position within Boraginales.
As reflected in Table 1, both the taxonomy and classification of Boraginales and relatives have been in flux for the last twenty years because of varied phylogenetic results and differing conclusions as to how best to treat the group based on inferred evolutionary history. Gottschling et al. (2001) were the first to conduct a molecular phylogenetic study of Boraginales and to reveal the major lineages. Based on ITS1 sequences, the authors proposed a classification in which order Boraginales was adopted, each of de Candolle's subfamilies of Boraginaceae was elevated to the rank of family, and Hydrophyllaceae and Lennoaceae were recognized. Relationships inferred from ITS1 were (Boraginaceae (Hydrophyllaceae (Heliotropiaceae Schrad. (Cordiaceae R. Br. ex Dumort. (Ehretiaceae C. Mart. ex Lindl. + Lennoaceae))))). Based on their taxon and nucleotide sampling, many of the constituent lineages received strong support as monophyletic. However, much of the backbone of the tree was either unresolved or not strongly supported, and Codon was not sampled. Gottschling et al. (2001) began to reveal the major lineages within Boraginales, but outgroup sampling was inadequate to assess placement of the group among asterids.
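The nested-parenthesis notation used above for the ITS1 relationships maps directly onto the Newick tree format. The following is a minimal sketch, assuming Python and a small hand-written parser; the names `ITS1_TOPOLOGY` and `leaf_depths` are hypothetical, and the "+" between Ehretiaceae and Lennoaceae is encoded here as a sister pair.

```python
# Topology reported from ITS1 (Gottschling et al. 2001), transcribed as Newick.
# Assumption: "Ehretiaceae + Lennoaceae" is encoded as a two-member clade.
ITS1_TOPOLOGY = (
    "(Boraginaceae,(Hydrophyllaceae,(Heliotropiaceae,"
    "(Cordiaceae,(Ehretiaceae,Lennoaceae)))));"
)


def leaf_depths(newick):
    """Return {leaf name: nesting depth} for a plain Newick string.

    Handles only names and parentheses (no branch lengths or support values).
    """
    depths = {}
    depth = 0
    name = []
    for ch in newick:
        if ch in "(),;":
            if name:  # a delimiter ends the current leaf name
                depths["".join(name)] = depth
                name = []
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif not ch.isspace():
            name.append(ch)
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    return depths


for family, depth in leaf_depths(ITS1_TOPOLOGY).items():
    print(f"{family}: depth {depth}")
```

Leaves with a smaller depth branch off nearer the root of this ladder-like topology, so the printed depths mirror the successive-sister pattern described in the text.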
Following Gottschling et al. (2001), other studies recovered a similar topology for Boraginales using different molecular markers, although taxon sampling was sparser in most (Moore and Jansen 2006; Cohen 2013; Weigend et al. 2013). None of these studies explored the closest relative to Boraginales. A study by Nazaire and Hufford (2012) is of note because of important novel findings, and their taxonomic treatment highlighted the continued debate about how best to classify Boraginales. The study generated new sequence data, supplemented by GenBank data, and assembled a data set that included nrITS, several chloroplast loci, and a broad sampling of Boraginales. This broad taxon sample included Codon, which was resolved as sister to what they treated as subfamily Boraginoideae. Contra Weigend and Hilger (2010), Nazaire and Hufford (2012) advocated recognizing Codon as a tribe of Boraginoideae rather than as a separate, monogeneric family. The authors supported the APG III (2009) classification, adopting a broadly circumscribed Boraginaceae with constituent clades treated as subfamilies (Table 1). However, none of the foregoing studies sampled broadly enough to identify the closest relatives of Boraginales.
Generally, as reviewed above, classification of borages and relatives has suffered from instability especially over the past 20 years. The biggest controversy arises from whether to treat the clade at the ordinal level with constituent clades as families or to treat the entire clade as a family. Many of the taxonomic changes may have been made prematurely, based on phylogenetic studies with poor taxon sampling or poorly resolved phylogenies. This problem is exacerbated by the uncertain position of Boraginales among lamiids.
In this study, I sought to (1) test relationships among Boraginales reported in previous studies and (2) determine the closest relatives of Boraginales using data from select chloroplast genic regions. Taxonomy is discussed in light of the phylogenetic analyses.

Plant Samples
Samples were field collected or obtained from dried herbarium specimens at the Rancho Santa Ana Botanic Garden (RSA-POM) herbarium. All specimen identifications were verified before inclusion in the study.

Taxon Sampling
DNA sequence data were generated for 41 taxa, representing all major lineages identified to date within Boraginales except Wellstedia Balf. f. and Hoplestigma Pierre; material of these two genera was not available. Within Boraginales, taxon sampling included major lineages found in previous molecular studies as follows: Boraginaceae, Hydrophyllaceae (Ferguson et al. 1999), Heliotropiaceae (Diane et al. 2002), Ehretiaceae (Gottschling et al. 2004), and Cordiaceae (Gottschling et al. 2005). A combination of de novo sequencing (Metteniusa, Oncotheca) and GenBank data was used to achieve wide sampling across potential sister lineages among lamiids. The goal was to sample taxa from all lineages that have been placed sister to or part of a polytomy with Boraginales in previous molecular studies, including Solanales, Lamiales, Gentianales, and Vahliaceae. In some cases, terminals are combinations of two or more species from the same genus. Although this assumes that each of these genera is monophyletic, this is unlikely to be problematic given the goals of this study. Trees were rooted with Campanula, which repeatedly has been shown to be well outside the lamiids (Albach et al. 2001; Bremer et al. 2002; Moore et al. 2010; Soltis et al. 2011). Table 2 provides a complete list of taxa, voucher specimens, and genic regions represented in the data matrix, as well as GenBank accession numbers for de novo sequences generated for this study. Appendix 1 provides species and numbers for accessions downloaded from GenBank.

Choice of Genic Regions
Chloroplast loci were selected for phylogeny reconstruction based on two criteria: (1) availability, either by de novo sequencing or from GenBank, for all taxa sampled, including the wide range of putative relatives of Boraginales, and (2) rates of evolution ranging from relatively fast to slow. The latter is important given that this study seeks to infer relationships at both the family and ordinal levels: rbcL evolves slowly and is easily aligned across all taxa, including the parasitic plants of Pholisma Nutt. ex Hook.; ndhF and especially the trnL-F spacer region evolve more rapidly and have a history of resolving species- and generic-level relationships with high statistical support (Shaw et al. 2005).

DNA Isolation and Sequencing
DNA was isolated using a three-day, modified version of the CTAB (cetyl trimethyl ammonium bromide) protocol (Doyle and Doyle 1987; Friar 2005). Amplifications were done in 25 μl volumes containing: 2.5 μl 10× standard Mg-free buffer, 1.25 μl 1.5 μM MgCl2, 0.125 μl 5000 U/ml Taq polymerase, 1.2 μl 10 μM forward and reverse primers, 1.25 μl 200 μM dNTPs, 16.375 μl H2O, and 1.0 μl total of 1-10 μg/ml genomic DNA. Amplifications were carried out on an Applied Biosystems 2720 thermal cycler, using the following conditions: 5 min at 95 °C; 35 cycles of 1 min at 95 °C, 40 s at 53 °C, and 1 min at 72 °C; and a final extension of 7 min at 72 °C. The PCR amplicons were precipitated with 20% polyethylene glycol 8000 (PEG) in 2.5 M NaCl using equal volumes of PEG and PCR product; the mixture was incubated at 37 °C for 15 min. DNA was pelleted by centrifugation for 15 min at 14,000 rpm. The pellet was washed with 80% ethanol. Sequencing was done on an ABI 3130xl at Rancho Santa Ana Botanic Garden using the same primers as for amplification.
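As a quick check on the thermal profile described above, the programmed hold time can be summed directly. This is a minimal sketch in Python using only the step durations given in the text; the constant and function names are hypothetical, and ramping time between temperatures is ignored.

```python
# Hold times in seconds, taken from the cycling conditions in the text.
INITIAL_DENATURATION_S = 5 * 60   # 5 min at 95 °C
CYCLE_STEPS_S = [60, 40, 60]      # 95 °C, 53 °C, 72 °C within each cycle
N_CYCLES = 35
FINAL_EXTENSION_S = 7 * 60        # 7 min at 72 °C


def programmed_hold_seconds():
    """Sum of programmed hold times; temperature ramping is not included."""
    return (
        INITIAL_DENATURATION_S
        + N_CYCLES * sum(CYCLE_STEPS_S)
        + FINAL_EXTENSION_S
    )


total_s = programmed_hold_seconds()
print(f"{total_s} s = {total_s / 60:.1f} min of programmed holds")
```

The actual run time on a given thermal cycler will be somewhat longer, since ramp rates add time between steps.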

Sequence Editing and Alignment
Sequences were edited using Sequencher® version 5.2. Annotated sequences were deposited in GenBank (see Appendix 1 for accession numbers). The initial alignment was done in ClustalW (Thompson et al. 1997) and the final alignment was achieved manually in MacClade 4.08 (Maddison and Maddison 2008).

Maximum Likelihood Analysis
Phylogenetic inference using a maximum likelihood optimality criterion (ML; Felsenstein 1981) was implemented using the RAxML 7.2.8 (Stamatakis 2006) plugin in Geneious version 7.1 (Kearse et al. 2012). The GTR + I + G model of nucleotide evolution was selected for both ndhF and trnL-F, and TVM + I + G was selected for rbcL, using the Akaike information criterion (AIC; Akaike 1974; Posada and Crandall 2001) implemented in jModeltest version 3.7 (Posada 2008). Statistical support was assessed with a maximum likelihood bootstrap (bs) analysis implemented in RAxML (Stamatakis 2006), with bs support values estimated from 10,000 replicates.

Table 2. List of taxa for newly generated data with voucher information and GenBank accession numbers. All vouchers are deposited at RSABG-POM Herbarium. Columns: Taxon Name, Voucher information, rbcL, ndhF, trnL-trnF.
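For reference, the model-selection step in the maximum likelihood analysis relies on the AIC, which in its standard form balances fit against the number of free parameters. The notation below is generic (k for the number of estimated parameters, \hat{L} for the maximized likelihood) and is not taken from the paper; the model with the lowest AIC is preferred for each locus.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```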

Bayesian Inference
Analyses were done using MrBayes version 3.1.2 (Huelsenbeck and Ronquist 2001) implemented on the CIPRES portal. Models for each dataset were determined as in the maximum likelihood analyses. The loci were run as separate process partitions in the MCMC algorithm. All Bayesian analyses ran for 10,000,000 generations, with sampling every 100 generations. Consensus trees were produced from trees sampled after the standard deviation of split frequencies reached a value of 0.01, with all trees prior to this discarded as burn-in. Posterior probabilities (pp) were calculated from post-burn-in trees.
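The burn-in rule described above can be illustrated with a short calculation. This is a sketch only, assuming Python: the run length, sampling interval, and 0.01 threshold come from the text, whereas the diagnostic values, the function names, and the choice to count the starting state at generation 0 as a sample are assumptions made for illustration.

```python
N_GENERATIONS = 10_000_000   # from the text
SAMPLE_EVERY = 100           # from the text
SD_THRESHOLD = 0.01          # split-frequency SD threshold from the text


def first_converged_generation(diagnostics, threshold=SD_THRESHOLD):
    """Return the first generation whose split-frequency SD is <= threshold."""
    for generation, sd in diagnostics:
        if sd <= threshold:
            return generation
    return None  # threshold never reached


def n_sampled_trees(n_generations=N_GENERATIONS, sample_every=SAMPLE_EVERY):
    """Trees sampled per run, assuming the state at generation 0 is sampled."""
    return n_generations // sample_every + 1


def n_post_burnin(convergence_generation, sample_every=SAMPLE_EVERY):
    """Trees kept when all samples before convergence_generation are discarded."""
    discarded = -(-convergence_generation // sample_every)  # ceiling division
    return n_sampled_trees(sample_every=sample_every) - discarded


# Hypothetical (generation, SD of split frequencies) checkpoints.
example_diagnostics = [(1_000_000, 0.050), (2_000_000, 0.012), (3_000_000, 0.009)]
converged_at = first_converged_generation(example_diagnostics)
print(converged_at, n_post_burnin(converged_at))
```

Posterior probabilities would then be computed only from the retained trees.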

Alternative Hypothesis Testing
Paraphyly of Hydrophyllaceae and Ehretiaceae (see below) was unexpected and called for testing of whether monophyly of either family can be refuted by the data. To accomplish this, the Shimodaira-Hasegawa test (Shimodaira and Hasegawa 1999) was implemented in PAUP* (Swofford 2003) to compare tree topologies yielded by the ML and Bayesian analyses to trees constraining these groups to monophyly. The best-fit model for each locus as determined above under the AIC criterion was used. Constrained topologies were Ehretiaceae monophyletic with respect to Pholisma, and Hydrophyllaceae I and II forming a single clade. Additionally, to investigate the sister taxon to Boraginales, constrained topologies placing Vahlia Thunb., Lamiales, Gentianales, and Solanales sister to Boraginales were analyzed.
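In outline, the Shimodaira-Hasegawa test asks whether the log-likelihood deficit of a constrained tree is larger than expected from sampling error alone. A hedged summary in generic notation (the symbols are not the paper's): T_best is the highest-likelihood tree in the candidate set, T_x is a constrained tree, and the replicate values come from B bootstrap resamplings of the data, centered so that all candidate trees share the same expected likelihood under the null hypothesis.

```latex
\delta_x = \ln L(T_{\mathrm{best}}) - \ln L(T_x), \qquad
p_x = \frac{1}{B}\sum_{b=1}^{B} \mathbf{1}\!\left[\delta_x^{(b)} \ge \delta_x\right]
```

A small p_x indicates that the constrained topology fits the data significantly worse than the best tree and can be rejected.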

Phylogenetic Inference: Relationships within Boraginales
Topologies from the likelihood and Bayesian analyses were identical. The maximum likelihood cladogram is presented in Fig. 1 with bs and pp support values. Boraginales formed a strongly supported clade (bs = 100, pp = 1.00). Boraginaceae + Codon were recovered as sister with strong support (bs = 100, pp = 1.00) and as the earliest diverging clade within the family; together they were sister to a clade comprising the remaining lineages. The majority of sampled hydrophylls formed a grade, albeit without strong support for the branching order (bs = 38, pp = 0.49). One group of Hydrophyllaceae formed a strongly supported clade (bs = 91, pp = 1.00) composed of Phacelia Juss., Draperia Torr., Eucrypta Nutt., and Hydrophyllum L. Another clade of Hydrophyllaceae was strongly supported as monophyletic and comprised the annual Nama L. and the long-lived shrub genera Eriodictyon Benth. and Wigandia Kunth. The remaining three lineages, Heliotropiaceae, Cordiaceae, and Ehretiaceae, formed a clade (bs = 40, pp = 0.96), but their interrelationships were not resolved with certainty by this dataset. A topology of Heliotropiaceae sister to Cordiaceae + Ehretiaceae (Fig. 1) was inferred consistently by both analytical approaches, but never with strong statistical support. The monophyly of Heliotropiaceae was strongly supported (bs = 100, pp = 1.00), as was that of Cordiaceae (bs = 99, pp = 1.00). Ehretiaceae were monophyletic with inclusion of the parasitic Pholisma (Lennoaceae) but with strong support only from pp (bs = 68, pp = 0.95).

Phylogenetic Inference: Relationships to Other Lamiids
Vahlia is recovered as sister to Boraginales, but this relationship lacks statistical support (bs = 36, pp = 0.51). Boraginales + Vahlia are placed in Lamiidae with strong support (bs = 100, pp = 1.00), along with Solanales, Gentianales, and Lamiales. Each of these orders was recovered as monophyletic with strong support. Lamiales was inferred as the first to diverge, followed by Gentianales (pp = 0.73). Solanales was recovered as sister to the Vahlia + Boraginales clade but not with strong support (bs = 70, pp = 0.87). The remaining taxa formed an unresolved clade: Garryales, Icacinaceae s.l., Metteniusa H. Karst., and Oncotheca Baill. Note that Metteniusa + Oncotheca resolved as sister taxa with strong support from pp (bs = 81, pp = 0.99), a novel result.

Alternative Hypothesis Testing
SH tests implemented in PAUP* returned mixed results for the topologies tested. When Hydrophyllaceae were constrained as a clade, monophyly could not be rejected by the data (p = 0.093), indicating that both hypotheses (monophyly and paraphyly) remain viable. The alternative hypothesis that forced Ehretiaceae to be monophyletic to the exclusion of Lennoaceae was rejected (p = 0.002). With respect to other lamiids, Gentianales were rejected as sister to Boraginales (p = 0.0003), but sister relationships involving Vahlia (p = 0.065), Lamiales (p = 0.059), and Solanales (p = 0.072) could not be rejected. Thus, all three of these taxa remain candidates as the closest living relative of Boraginales.
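The outcome of the six constraint tests can be tabulated in a few lines. The sketch below, in Python, uses the p-values reported above; the 0.05 significance threshold is an assumption (the text does not state the cutoff explicitly), and the dictionary labels and function name are hypothetical.

```python
ALPHA = 0.05  # assumed significance threshold; not stated in the text

# Constrained topologies and SH-test p-values as reported in the text.
SH_P_VALUES = {
    "Hydrophyllaceae monophyletic": 0.093,
    "Ehretiaceae monophyletic, excluding Lennoaceae": 0.002,
    "Gentianales sister to Boraginales": 0.0003,
    "Vahlia sister to Boraginales": 0.065,
    "Lamiales sister to Boraginales": 0.059,
    "Solanales sister to Boraginales": 0.072,
}


def rejected(p_values, alpha=ALPHA):
    """Return the constrained hypotheses rejected at the given threshold."""
    return sorted(name for name, p in p_values.items() if p < alpha)


for name, p in SH_P_VALUES.items():
    verdict = "rejected" if p < ALPHA else "not rejected"
    print(f"{name}: p = {p} ({verdict})")
```

Under this assumed threshold only two constraints are rejected, which matches the conclusions drawn in the text; note that the three non-rejected sister hypotheses all have p-values only slightly above 0.05.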

Phylogeny of Boraginales
This study advances our knowledge of relationships among lineages of Boraginales in several ways. The first clade to diverge consists of Boraginaceae + Codon. This is consistent with two recently published studies (Nazaire and Hufford 2012; Weigend et al. 2014). Wellstedia, which was not sampled for the present study, belongs in this clade as well based on previous analyses of chloroplast sequences. Other studies of Boraginales focused on lower-level relationships (Långström and Chase 2002; Nazaire and Hufford 2012; Weigend et al. 2013; Cohen 2013) or on relationships among species and genera within tribes (Luebert and Wen 2008; Cohen and Davis 2012).
The second major clade within Boraginales consists of four lineages, here referred to as Hydrophyllaceae, Heliotropiaceae, Cordiaceae, and Ehretiaceae. To the degree that taxon sampling is comparable, paraphyly of Hydrophyllaceae is consistent with previous phylogenetic analyses, but this split is not strongly supported statistically. Further, alternative hypothesis testing demonstrates that these data cannot refute monophyly of Hydrophyllaceae. This phylogenetic uncertainty illustrates the need for both additional taxon sampling and especially more sequence data. Taxonomic changes cannot be recommended at this time, especially since morphological synapomorphies of the clades have yet to be identified. Placement of taxa in Hydrophyllaceae by de Candolle (1846) and subsequent authors seems to have been largely based on symplesiomorphies. A well-resolved and supported phylogeny will allow specialists in the group to identify morphological apomorphies that clearly delimit clades and will pave the way for a stable classification.
Cordiaceae and Ehretiaceae resolve as sister clades, but with poor support (bs = 38, pp = 0.50), consistent with other studies (e.g., Nazaire and Hufford 2012; Weigend et al. 2014). Cordia L. is monophyletic and sister to the remaining Cordiaceae, which contrasts with Gottschling et al.'s (2005) study wherein Patagonula L. is embedded within Cordia. Additional research is needed to fully elucidate relationships within Cordiaceae.
Ehretiaceae is rendered paraphyletic by the three Pholisma species sampled from Lennoaceae, which are nested within it. The present analysis places Tiquilia Pers. sister to Pholisma with weak support (pp = 0.65), and these two genera, together with the remaining Ehretiaceae sampled, form a clade (pp = 0.95), indicating that the parasitic clade should be subsumed into Ehretiaceae in order to establish monophyletic taxa. Further support for the placement of Pholisma within Ehretiaceae comes from the SH test, which rejected monophyly of Ehretiaceae to the exclusion of Pholisma (p = 0.002), given the data. This contrasts with Weigend et al. (2014), wherein Pholisma forms a polytomy with several other taxa from Ehretiaceae; Pholisma is likewise not placed with certainty in Gottschling et al. (2014). The alternative, splitting Ehretiaceae into smaller units, is left up to taxonomic specialists in the group.

Relationships with Other Lamiids
Vahlia is recovered as sister to Boraginales, consistent with Nazaire and Hufford (2012) and Bremer et al. (2002). This finding contrasts with Weigend et al. (2014), wherein Vahlia is sister to Lamiales. In the present study, Vahlia + Boraginales are sister to Solanales, followed successively by Gentianales, then Lamiales. However, none of these relationships is statistically supported. The Shimodaira-Hasegawa tests eliminate Gentianales as sister to Boraginales but retain Vahlia (p = 0.065), Solanales (p = 0.072), and Lamiales (p = 0.059) in contention. Other studies such as Soltis et al. (2011), based on 17 genes from across the mitochondrion, ribosomal cistron, and plastome, resolved relationships in Lamiidae as (Vahliaceae ((Boraginaceae + Lamiales) (Gentianales + Solanales))), but also without strong support. With the advent of high-throughput sequencing, these relationships may be clarified by sampling a number of independent loci from the nucleus.

Notes on Classification
The advantages of creating classifications that reflect phylogeny and recognize monophyletic taxa have been well documented. Once phylogenetic patterns and monophyletic groups are identified and strongly supported, resulting classifications should remain stable unless new data, whether molecular or morphological, alter our understanding of phylogenetic patterns.
Over the last 15 years, APG has consistently recognized a broad Boraginaceae, inclusive of Hydrophyllaceae and Lennoaceae. Recent textbooks and field guides reflect this change. However, recognition of Boraginales at the ordinal level with as many as nine families (some monogeneric) has been advocated by taxonomic specialists in the group (Gottschling et al. 2001; Cohen 2013; Weigend et al. 2013, 2014).
The constituent genera and species of Hydrophyllaceae remain incompletely known, and it is still unclear whether hydrophylls represent one lineage or two. Considerably expanded taxon and nucleotide sampling will be required to clearly delineate clade membership.
Additionally, relationships as reconstructed here and by Nazaire and Hufford (2012) and Gottschling et al. (2014) show that the parasitic taxa placed in de Candolle's Lennoaceae are embedded within Ehretiaceae. A classification recognizing Lennoaceae is inconsistent with phylogenetic relationships and should be revised either to split groups within Ehretiaceae further or to subsume the parasites into a taxonomic group inclusive of their autotrophic relatives. It would be best to sample Lennoa before formally making taxonomic changes. It would also be wise to test the placement of the parasitic plants shown here with data from other genomes, as chloroplast loci may be strongly impacted by the plant's life history strategy (Krause 2008). The chloroplasts of holoparasites may be subject to high rates of evolution and to mutations resulting in loss of gene function, evolutionary patterns that can be misleading when these loci are used for phylogenetic inference (Bromham et al. 2013). In the interest of stability in naming systems, and in light of morphological characters discussed by other authors and the taxonomy advocated by specialists in Boraginales, this study supports recognition of Boraginales sensu Weigend et al. (2014).

CONCLUSIONS
It is likely that the chloroplast genome has a different evolutionary history than that of nrITS, as suggested by the different branching order among families of Boraginales in this study as compared to Gottschling et al. (2001), as well as by the close affiliation of Pholisma with Tiquilia within Ehretiaceae. More sampling of the nuclear and possibly the mitochondrial genome may reveal alternate hypotheses of relationships within Boraginales. Circumscribing Hydrophyllaceae as monophyletic or as forming two distinct clades is a matter for future studies by other authors, as the data analyzed here cannot refute either hypothesis of relationships.
Until these relationships are more fully investigated with additional sequence data and taxa it would be premature to make drastic changes to the classification.
The closest relatives of Boraginales remain ambiguous. Again, additional sampling from the other genomes using high-throughput sequencing may be necessary to resolve this relationship. However, given that whole-genome sampling did not resolve relationships among lamiids with strong support (Moore et al. 2010), these relationships may remain elusive, perhaps reflecting rapid evolutionary divergence within Lamiidae.