Nineteen P1 and TAC clones, which have been precisely localized on the fine physical map of Arabidopsis thaliana chromosome 5, were newly sequenced, and their sequence features were analysed. The total length of the clones sequenced was 1,456,315 bp. Together with the previously reported sequences, the total length of the regions of chromosome 5 sequenced to date is now 5,310,105 bp. When the sequences determined in this study were subjected to similarity searches against protein and expressed sequence tag (EST) databases and to analysis with computer programs for gene modeling, a total of 354 potential protein-coding genes and/or gene segments were identified. The average density of the assigned genes and/or gene segments was one gene per 4,114 bp. Introns were identified in 75% of the potential protein-coding genes; the average number of introns per gene was 3.7, and the average intron length was 194 bp. These sequence features are essentially identical to those of the previously reported sequences. The number of Arabidopsis ESTs matching each predicted gene was counted as an indicator of its transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (the Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
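The reported gene density follows directly from the figures above. As a minimal arithmetic check (the function and variable names below are illustrative assumptions, not part of the original analysis), the density and the length of the previously reported sequence can be recomputed as:

```python
# Figures taken from the abstract; all names here are hypothetical helpers.
NEWLY_SEQUENCED_BP = 1_456_315   # total length of the 19 clones in this study
CUMULATIVE_BP = 5_310_105        # chromosome 5 sequence determined to date
GENE_COUNT = 354                 # potential protein-coding genes and/or segments


def gene_density(total_bp: int, gene_count: int) -> float:
    """Return the average number of base pairs per assigned gene."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return total_bp / gene_count


def previously_reported_bp(cumulative_bp: int, new_bp: int) -> int:
    """Return the sequence length reported before this study."""
    return cumulative_bp - new_bp


if __name__ == "__main__":
    density = gene_density(NEWLY_SEQUENCED_BP, GENE_COUNT)
    print(f"One gene per {round(density):,} bp")
    earlier = previously_reported_bp(CUMULATIVE_BP, NEWLY_SEQUENCED_BP)
    print(f"Previously reported: {earlier:,} bp")
```

Rounding the quotient to the nearest base pair reproduces the stated density of one gene per 4,114 bp.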