TY - JOUR
T1 - A large scale analysis of cDNA in Arabidopsis thaliana
T2 - Generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries
AU - Asamizu, Erika
AU - Nakamura, Yasukazu
AU - Sato, Shusei
AU - Tabata, Satoshi
PY - 2000
Y1 - 2000
N2 - For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5′-end ESTs and 39,207 3′-end ESTs were obtained. The 3′-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery. The EST sequence data of individual cDNA clones are available at the web site: http://www.kazusa.or.jp/en/plant/arabi/EST/.
KW - Arabidopsis thaliana
KW - EST
KW - cDNA
UR - http://www.scopus.com/inward/record.url?scp=0034733243&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034733243&partnerID=8YFLogxK
U2 - 10.1093/dnares/7.3.175
DO - 10.1093/dnares/7.3.175
M3 - Article
C2 - 10907847
AN - SCOPUS:0034733243
VL - 7
SP - 175
EP - 180
JO - DNA Research
JF - DNA Research
SN - 1340-2838
IS - 3
ER -