A total of 17 P1 and TAC clones, each containing one or more markers specifically mapped to chromosome 5, were isolated from P1 and TAC libraries of the Arabidopsis thaliana Columbia genome. Their nucleotide sequences were determined by a shotgun-based strategy and precisely located on the physical map of chromosome 5. The total length of the clones sequenced in this study was 1,191,918 bp. As we have previously reported 2,662,078 bp of sequence from the analysis of 33 P1 clones, the total length of chromosome 5 sequence determined so far is now 3,853,996 bp. The sequences determined in this study were subjected to similarity searches against protein and EST databases and to analysis with computer programs for gene modeling, and a total of 310 potential protein-coding genes and/or gene segments with known or predicted functions were identified. The positions of exons that show no apparent similarity to known genes were also predicted by computer-aided analysis. The average density of the assigned genes and/or gene segments was 1 gene per 3,845 bp. Introns were identified in 78% of the potential protein-coding genes; the average number of introns per gene was 3.7, and the average intron length was 185 bp. The number of Arabidopsis ESTs matching each predicted gene was counted as an indicator of its transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
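The cumulative length and gene density quoted above follow directly from the reported figures. The short Python sketch below reproduces that arithmetic. The function and constant names are illustrative only and are not part of the original analysis. It assumes the density was obtained by dividing the length sequenced in this study by the 310 assigned genes and/or gene segments, then rounding to the nearest base pair.

```python
# Figures quoted in the abstract (lengths in base pairs).
THIS_STUDY_BP = 1_191_918   # 17 P1 and TAC clones sequenced in this study
PREVIOUS_BP = 2_662_078     # 33 P1 clones reported previously
ASSIGNED_GENES = 310        # potential protein-coding genes and/or gene segments


def total_sequenced(this_study_bp: int, previous_bp: int) -> int:
    """Cumulative chromosome 5 sequence determined so far, in bp."""
    return this_study_bp + previous_bp


def bp_per_gene(length_bp: int, n_genes: int) -> int:
    """Average spacing of assigned genes, rounded to the nearest bp.

    Assumption: density is computed over the sequence from this study only.
    """
    return round(length_bp / n_genes)


if __name__ == "__main__":
    print(total_sequenced(THIS_STUDY_BP, PREVIOUS_BP))
    print(bp_per_gene(THIS_STUDY_BP, ASSIGNED_GENES))
```

Under these assumptions the two values agree with the 3,853,996 bp total and the density of 1 gene per 3,845 bp stated in the text.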