Genomic Resources for Asparagales

Enormous genomic resources have been developed for plants in the monocot order Poales; however, it is not known how useful these resources will be for other economically important monocots. Asparagales is a monophyletic order sister to class Commelinanae, which includes Poales, and is the second most economically important monocot order. Developing genomic resources for Asparagales and applying them is challenging because of huge nuclear genomes and the relatively long generation times required to develop segregating families. We synthesized a normalized cDNA library of onion (Allium cepa) and produced 11,008 unique expressed sequence tags (ESTs) for comparative genomic analyses of Asparagales and Poales. Alignments of onion ESTs, Poales ESTs, and genomic sequences from rice were used to design oligonucleotide primers amplifying genomic regions from asparagus, garlic, and onion. Sequence analyses of these genomic regions revealed microsatellites, insertions/deletions, and single nucleotide polymorphisms for comparative mapping of rice and Asparagales vegetables. Initial mapping revealed no obvious synteny at the recombinational level between onion and rice, indicating that genomic resources developed for Poales may not be applicable to the monocots as a whole. Genomic analyses of Asparagales would greatly benefit from EST sequencing and deep-coverage, large-insert genomic libraries of representative small-genome model species within the "higher" and "lower" Asparagales, such as asparagus and orchid, respectively.


INTRODUCTION
Class Commelinanae and order Asparagales are two major monophyletic groups within the monocots (Chase et al. 1995; Rudall et al. 1997). Phylogenetic estimates based on chloroplast-gene sequences revealed that the commelinoid monocots are sister to Asparagales, and that these groups together are sister to Liliales (Chase et al. 2000; Fay et al. 2000). The most economically important monocots are in Commelinanae, order Poales. The second most economically important monocot order is Asparagales, which includes such valuable plants as agave, aloe, asparagus, garlic, leek, onion, and vanilla. The "higher" Asparagales form a well-defined clade within Asparagales (Chase et al. 1995, 1996) and include Alliaceae (chive, garlic, leek, and onion), Amaryllidaceae (various ornamentals and yucca), and Asparagaceae (asparagus).
The development of genomic resources for Asparagales is challenging due to huge nuclear genomes (Fig. 1), relatively long generation times, and few financial resources. Onion, the most economically important member of Asparagales, is a diploid (2n = 2x = 16) with one of the largest nuclear genomes among all plants (Fig. 1) at 15,290 megabase-pairs (Mbp) (Arumuganathan and Earle 1991) of DNA per haploid (1C) nucleus (6, 16, and 107 times greater than maize, tomato, and Arabidopsis thaliana (L.) Heynh., respectively). In fact, diploid onion contains as much DNA as hexaploid wheat, and on average each onion chromosome carries an amount of DNA equal to 75% of the 1C content of the maize nuclear genome (Bennett and Smith 1976). Molecular studies have revealed that the GC content of onion DNA is 32%, the lowest known for any angiosperm (Kirk et al. 1970; Stack and Comings 1979; Matassi et al. 1989). Density-gradient centrifugation revealed no significant satellite DNA bands, except for a 375-bp telomeric sequence (Barnes et al. 1985). Ty1/copia-like retrotransposons are present throughout the bulb-onion genome and concentrated in terminal heterochromatic regions (Leeton and Smyth 1993; Pearce et al. 1996). Stack and Comings (1979) used C0t reassociation kinetics to demonstrate that the onion genome consists of middle-repetitive sequences occurring in short-period interspersions among single-copy regions. The huge nuclear genome of onion is not likely due to a recent polyploidization event. Jones and Rees (1968) and Ranjekar et al. (1978) proposed that intrachromosomal duplications contributed to increased chromosome sizes in onion. The large nuclear genome of onion has no effect on the number of markers required to produce detailed genetic maps, which have been developed using intra- (King et al. 1998a) and interspecific (van Heusden et al. 2000) crosses.
Disclaimer: Names are necessary to report factually on available data; however, the US Department of Agriculture (USDA) neither guarantees nor warrants the standard of the product, and the use of the name by USDA implies no approval of the product to the exclusion of others that may also be suitable.
Garlic, the second most economically important Asparagales, is a diploid (2n = 2x = 16) plant and has a nuclear genome about 7% smaller than onion (Ori et al. 1998). Because of its obligate vegetative production, no genetic studies have been reported for garlic. Occasionally, garlic plants can be induced to produce flower stalks with a few fertile flowers. Recently, true seed production has been realized in garlic (Etoh 1983, 1986; Pooler and Simon 1994) and genetically defined families produced (Jenderek and Hannan 2000). These families offer the first opportunity in history to develop a genetic map of garlic, assess synteny with onion and other plants, and open the door for genetic improvement of this important plant.
Asparagus is a diploid (2n = 2x = 20), dioecious plant with a relatively small DNA content at 1800 Mbp per 1C (Galli et al. 1993). Much effort has gone into genetic mapping of asparagus using a variety of segregating families (Restivo et al. 1995; Jiang et al. 1997; Spada et al. 1998).

Table 1. Numbers of BAC clones of 150 kb required for complete coverage of the nuclear genome at 99% probability for representative plants in the order Asparagales. Minimum numbers of BAC clones were calculated using the formula on page 9.3 of Sambrook et al. (1989). [Table body not recovered; surviving column headings: Genome size, No.]

A bacterial artificial chromosome (BAC) library of asparagus covering approximately 2.6X of the genome has been synthesized (Nitz et al. 2002). There is variation for genome sizes among asparagus species, due in part to polyploidy (Stajner et al. 2002). Asparagus (A. officinalis) has twice as much nuclear DNA as the South African asparagus fern (A. plumosus Baker), even though both plants have the same chromosome number (2n = 2x = 20). It is presently unclear if the larger asparagus genome is due to accumulation of repetitive DNA or a genome doubling followed by chromosome fusions.
The application of genomic technologies to Asparagales would be greatly enhanced by generation of expressed sequence tags (ESTs), production of high-density genetic maps based on homologous sequences from divergent species, and the assembly of genomic contigs covering large regions of the nuclear DNA. These resources would allow breeders and geneticists to efficiently assign important phenotypes to chromosomes by segregation analyses, identify syntenous genomic regions in small-genome model species, and sequence to reveal candidate genes. Asparagus has one of the smallest nuclear genomes known among the higher Asparagales (Fig. 1); however, this genome is still three times larger than rice and about half that of maize. BAC libraries of asparagus would require 38,000 clones of 130 kb for a 1X coverage at a 99% confidence level, many fewer than for other members of Asparagales (Table 1).
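The clone-number calculation behind library-size estimates of this kind can be sketched as follows. This is a minimal illustration assuming the standard library-representation formula N = ln(1 - P) / ln(1 - f), where P is the desired probability of recovering any given sequence and f is the ratio of insert size to haploid genome size. The function names and the example inputs are ours, not the authors', and results depend on the genome and insert sizes supplied, so they need not reproduce the values in Table 1 or in the text.

```python
import math


def clones_required(genome_size_bp: float, insert_size_bp: float,
                    probability: float = 0.99) -> int:
    """Minimum number of clones so that any given sequence is present
    in the library with the requested probability.

    Assumes N = ln(1 - P) / ln(1 - f), with f = insert size / genome size.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1 (exclusive)")
    if not 0 < insert_size_bp < genome_size_bp:
        raise ValueError("insert size must be positive and smaller than the genome")
    fraction = insert_size_bp / genome_size_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


def fold_coverage(n_clones: int, insert_size_bp: float,
                  genome_size_bp: float) -> float:
    """Average number of genome equivalents represented by a library."""
    return n_clones * insert_size_bp / genome_size_bp


if __name__ == "__main__":
    # Illustrative inputs only: a 1800-Mbp genome and 150-kb inserts.
    n = clones_required(1.8e9, 150e3, 0.99)
    print(n, round(fold_coverage(n, 150e3, 1.8e9), 2))
```

Because the required clone number scales almost linearly with genome size at a fixed insert size, the same function makes explicit why small-genome species are attractive as models.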
Enormous numbers of expressed sequence tags and deep-coverage genomic libraries have been produced for members of Poales, including barley, rice, maize, sorghum, sugarcane, and wheat. Genetic linkage conservation (synteny) among Poales is widely recognized (Moore 1995; Devos and Gale 1997, 2000) [...] of the rice nuclear genome as model system for Poaceae (Gale and Devos 1998). It is not clear how representative Poales as a group, and rice as an individual species, are for the monocots as a whole or how widely genomic resources developed for Poales will be applicable to other monocots with large, complex genomes. To address these questions, we generated 11,008 unique onion ESTs and are completing comparative mapping of asparagus, garlic, onion, and rice.

MATERIALS AND METHODS
Synthesis of and sequencing from a normalized complementary (c) DNA library of onion has been previously described by Kuhl et al. (2004) and its characteristics are repeated here for convenience only. Tissue was harvested from immature bulbs of onion cultivar 'Red Creole,' the callus of an unknown onion cultivar, and roots of cultivars 'Ebano' and 'Texas Legend.' All tissues were immediately frozen in liquid nitrogen after harvest and stored at -80°C until RNA extraction by the Trizol/Messagemaker system (Invitrogen, Carlsbad, California, USA) and oligo-dT chromatography (Sambrook et al. 1989). Equal molar amounts of polyA+ messenger (m) RNA from the three tissues were combined and cDNAs synthesized after priming with oligo-dT. cDNAs were size selected to enrich for molecules >1.0 kb and subjected to a proprietary normalization process (Invitrogen). cDNAs were directionally cloned into the pCMVSport6.1ccdb (Invitrogen) vector. A sample of cDNAs from the normalized and non-normalized libraries were plated, transferred to colony-lift membranes, and hybridized with a β-tubulin clone from onion to assess the efficacy of the normalization step.
A total of 20,000 random clones were subjected to single-pass sequencing reactions from the 5'-end. Base calling, vector trimming, and the removal of low-quality bases were performed (Chou and Holmes 2001). ESTs were assembled into tentative consensus (TC) groups using clustering tools as described by Pertea et al. (2003). The onion TCs and singletons were searched against the rice gene index at The Institute for Genomic Research (TIGR) using BLAST (Yuan et al. 2000) and requiring >70% identity extending to within 30 bp of the ends. Matching rice ESTs were then searched against rice BAC sequences requiring matches >95% identity over 80% of EST lengths. The positions of these accessions on the rice genetic map were identified based on the alignments of rice marker and BAC sequences. Onion ESTs were selected that showed high similarities (e < -10) to single positions on rice chromosomes. The selected onion ESTs were aligned on corresponding rice BACs and external and nested primers were designed based on sequence conservation between onion and rice. Onion genomic regions were amplified from inbred lines Ailsa Craig (AC) 43 and Brigham Yellow Globe (BYG) 15-23. Single amplicons were excised from agarose gels, cloned into a plasmid vector, and sequenced (Lilly and Havey 2001). Single nucleotide polymorphisms, insertions/deletions (indels), or polymorphic restriction-enzyme sites were revealed after sequence alignments and scored using amplicons from the segregating onion family from BYG 15-23 by AC43 (King et al. 1998a).
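The two similarity filters described above can be expressed as simple predicates on alignment records. The sketch below is illustrative only: the `Hit` record, its field names, and the reading of "within 30 bp of the ends" as an unaligned overhang of at most 30 bp at each end of the query are our assumptions, not a description of the software actually used.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical alignment record (field names are illustrative)."""
    query_len: int        # length of the query sequence in bp
    align_start: int      # 1-based start of the alignment on the query
    align_end: int        # 1-based inclusive end of the alignment on the query
    pct_identity: float   # percent identity across the aligned region


def passes_gene_index_filter(hit: Hit, min_identity: float = 70.0,
                             max_overhang: int = 30) -> bool:
    """Onion sequence vs. rice gene index: >70% identity, with the
    alignment reaching to within 30 bp of both ends of the query."""
    left_overhang = hit.align_start - 1
    right_overhang = hit.query_len - hit.align_end
    return (hit.pct_identity > min_identity
            and left_overhang <= max_overhang
            and right_overhang <= max_overhang)


def passes_bac_filter(hit: Hit, min_identity: float = 95.0,
                      min_fraction: float = 0.80) -> bool:
    """Rice EST vs. rice BAC: >95% identity over at least 80% of the EST."""
    aligned_len = hit.align_end - hit.align_start + 1
    return (hit.pct_identity > min_identity
            and aligned_len / hit.query_len >= min_fraction)


if __name__ == "__main__":
    example = Hit(query_len=600, align_start=10, align_end=590, pct_identity=82.0)
    print(passes_gene_index_filter(example), passes_bac_filter(example))
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the set of selected ESTs is to the cutoffs.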

RESULTS AND DISCUSSION
The normalized onion cDNA library consisted of 3.4 × 10^7 recombinants with an average insert size of 1.6 kb. The normalization process was successful and reduced 70-fold the frequency of β-tubulin cDNAs. We completed 20,000 single-pass sequencing reactions on random clones from this library to yield 18,484 high-quality ESTs of at least 200 bp in size. These 18,484 sequences assembled into 3690 tentative consensus (TC) sequences and 7318 singletons, yielding 11,008 unique transcripts. These onion sequences represent the first large set of publicly available ESTs from a monocot outside of Poales. Putative functions were assigned to the 11,008 unique onion sequences by searching against manually annotated gene ontology (GO) proteins from model species. A total of 2608 sequences could be assigned to GO functional annotations (Fig. 2). The most common annotation class among onion ESTs was metabolism (19%), as previously observed in tomato (Lycopersicon esculentum L.; Van der Hoeven et al. 2002) and rice (Kikuchi et al. 2003). These ESTs were used to develop the onion gene index (http://www.tigr.org/tigr-scripts/tgi/T-index.cgi?species=onion), the first for a monocot outside of class Commelinanae. This is a searchable gene index with assignment of onion ESTs to likely gene ontologies and metabolic pathways.
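The bookkeeping behind these counts can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the derived quantities (sequencing success rate, mean ESTs per TC, fraction with GO annotation) are our own illustrative calculations and are not stated in the text.

```python
# Counts reported in the text.
clones_sequenced = 20_000
high_quality_ests = 18_484
tentative_consensus = 3_690
singletons = 7_318
go_annotated = 2_608

# Unique transcripts are the TCs plus the ESTs left unassembled.
unique_transcripts = tentative_consensus + singletons

# Derived quantities (illustrative; not reported by the authors).
success_rate = high_quality_ests / clones_sequenced
ests_in_tcs = high_quality_ests - singletons
mean_ests_per_tc = ests_in_tcs / tentative_consensus
go_fraction = go_annotated / unique_transcripts

if __name__ == "__main__":
    print(f"unique transcripts: {unique_transcripts}")
    print(f"sequencing success rate: {success_rate:.1%}")
    print(f"mean ESTs per TC: {mean_ests_per_tc:.2f}")
    print(f"fraction with GO annotation: {go_fraction:.1%}")
```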
Onion genomic amplicons from BYG15-23 and AC43 revealed primarily single nucleotide polymorphisms (SNPs) with relatively few microsatellites or indels. Transitions were the most common class of SNPs at approximately 30% each for TC and AG polymorphisms; transversions were less common at approximately 10% each for GT, AC, CG, and AT polymorphisms. Because of the residual heterozygosity in onion inbreds (King et al. 1998b), some SNPs between parental inbreds do not segregate in our onion families.
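The distinction between the two SNP classes can be made concrete with a short classifier: a transition exchanges two purines (A, G) or two pyrimidines (C, T), whereas a transversion exchanges a purine for a pyrimidine. The sketch below is illustrative; the function name is ours, and the frequency table simply restates the approximate percentages given above.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def snp_class(allele_a: str, allele_b: str) -> str:
    """Classify a biallelic SNP as a 'transition' or a 'transversion'."""
    a, b = allele_a.upper(), allele_b.upper()
    valid = PURINES | PYRIMIDINES
    if a not in valid or b not in valid or a == b:
        raise ValueError("alleles must be two different bases from A, C, G, T")
    pair = {a, b}
    same_chemical_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_chemical_class else "transversion"


# Approximate frequencies (%) as stated in the text.
approx_frequency = {"TC": 30, "AG": 30, "GT": 10, "AC": 10, "CG": 10, "AT": 10}

transition_share = sum(
    pct for pair, pct in approx_frequency.items()
    if snp_class(pair[0], pair[1]) == "transition"
)

if __name__ == "__main__":
    for pair in approx_frequency:
        print(pair, snp_class(pair[0], pair[1]))
    print("approximate share of transitions (%):", transition_share)
```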
A useful application of Poales genomic resources would be synteny near chromosome regions carrying unique, economically important genes in Asparagales. A case in point is cytoplasmic-genic male sterility (CMS) used to produce hybrid-onion seed. The onion has perfect flowers and the production of hybrid seed requires systems of male sterility. For the most widely used source of CMS in onion, male sterility is conditioned by the interaction of sterile (S) cytoplasm and the homozygous recessive genotype at a single nuclear male-fertility restoration locus (Ms) (Jones and Clarke 1943). Maintainer lines used to seed propagate male-sterile lines possess normal fertile (N) cytoplasm and are homozygous recessive at the Ms locus (Jones and Davis 1944). Because onion is a biennial, four to eight years are required to establish if maintainer lines can be extracted from a population or family (Gokce et al. 2002). This requires great expenditures of time and money, and hybrid-onion development would significantly benefit from molecular markers that establish the nuclear genotype of onion plants, significantly reducing the number of testcrosses required to identify maintainer plants. We previously identified molecular markers flanking Ms at 0.9 and 8.6 centiMorgans (cM) (Fig. 3; Gokce et al. 2002); however, these molecular markers were in linkage equilibrium with Ms in open-pollinated onion populations (Gokce and Havey 2002). We require more closely linked markers flanking Ms to make practical marker-facilitated selection of maintainer lines from open-pollinated populations. Instead of relying on a random approach to identify closely linked markers, we would prefer to exploit the genomic resources developed for Poales, using rice as the model. The onion cDNA AOB272 reveals a molecular marker tightly linked to Ms at 0.9 cM (Fig. 3). AOB272 shows a single, highly significant (e < -97) hit to one position in the rice genome on chromosome 3 at 139.8 cM. Sequence comparisons among molecular markers across a 10 cM region close to the Ms locus of onion revealed no obvious synteny on the recombinational scale with rice (Fig. 3). In order to test if microsynteny exists between onion and rice, we are mapping onion ESTs showing highly significant similarities to single positions in the rice genome physically linked to position 139.8 on chromosome 3. If microsynteny exists, these onion ESTs should show tight linkage to the chromosome region carrying AOB272 and Ms. Allele-specific markers could then be developed, allowing the seed industry to establish genotypes at the Ms locus without labor-intensive and time-consuming testcrosses.
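The genetic rules for this CMS system can be stated compactly in code. The sketch below encodes only what is described above: plants are male sterile when sterile (S) cytoplasm is combined with the homozygous recessive genotype at Ms, and maintainers carry normal (N) cytoplasm with the same homozygous recessive genotype. The function names and the representation of genotypes as pairs of allele strings are illustrative assumptions.

```python
from typing import Tuple

Genotype = Tuple[str, str]  # two alleles at Ms, each "Ms" (dominant) or "ms" (recessive)


def _check(cytoplasm: str, genotype: Genotype) -> None:
    """Reject inputs outside the two cytoplasms and two alleles considered here."""
    if cytoplasm not in ("S", "N"):
        raise ValueError("cytoplasm must be 'S' (sterile) or 'N' (normal)")
    if len(genotype) != 2 or any(a not in ("Ms", "ms") for a in genotype):
        raise ValueError("genotype must be two alleles, each 'Ms' or 'ms'")


def is_male_sterile(cytoplasm: str, genotype: Genotype) -> bool:
    """Male sterile: S cytoplasm and homozygous recessive at Ms."""
    _check(cytoplasm, genotype)
    return cytoplasm == "S" and all(a == "ms" for a in genotype)


def is_maintainer(cytoplasm: str, genotype: Genotype) -> bool:
    """Maintainer: N cytoplasm and homozygous recessive at Ms."""
    _check(cytoplasm, genotype)
    return cytoplasm == "N" and all(a == "ms" for a in genotype)


if __name__ == "__main__":
    for cyto in ("S", "N"):
        for geno in (("Ms", "Ms"), ("Ms", "ms"), ("ms", "ms")):
            print(cyto, geno, is_male_sterile(cyto, geno), is_maintainer(cyto, geno))
```

Under these rules, every plant with N cytoplasm is male fertile whatever its Ms genotype, so maintainers cannot be recognized by phenotype; this is why testcrosses, or markers that report the nuclear genotype, are needed.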
In conclusion, initial comparative mapping of rice and onion revealed no obvious synteny between Asparagales and Poales at the recombinational level (Fig. 3). It is possible that shorter regions of synteny may exist that could be exploited for identification and cloning of candidate genes in Asparagales. However, our initial results indicate that Poales may not be representative of all monocots, requiring that genomic resources be independently developed for major monophyletic lineages within the monocots. Specific small-genome species within the higher and lower Asparagales, such as asparagus and orchid, respectively, represent obvious model plants for the development of deep-coverage genomic libraries and comparative mapping. Genetic and genomic analyses of these model Asparagales, such as comparative mapping and BAC-sequence analyses of Asparagales to estimate the level of synteny on the recombination and sequence levels, will allow for direct comparisons among Asparagales and Poales and will reveal the most appropriate genomic model for important monophyletic lineages outside of Poales. Comparative genomic analyses among Asparagales, Poales, and other monocot orders will provide important insights about genome evolution and diversification for these important plants.

Fig. 1.-Relative amounts of nuclear DNA (megabase-pairs per 1C nucleus) for major cultivated monocots (Arumuganathan and Earle 1991).
Fig. 3.-Comparative locations of expressed sequence tags on onion linkage group B (left) with the most highly similar (e < -20) genomic locations in rice.