Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1

Nobuo Nomura, Nobuyuki Miyajima, Takashi Sazuka, Ayako Tanaka, Yutaka Kawarabayasi, Shusei Sato, Takahiro Nagase, Naohiko Seki, Ken Ichi Ishikawa, Satoshi Tabata

Research output: Contribution to journal › Article

263 Citations (Scopus)

Abstract

We established a protocol for the prediction of the coding sequences of unidentified human genes based on the double selection and sequence analysis of cDNA clones with inserts carrying unreported 5′-terminal sequences and with insert sizes corresponding to nearly full-length transcripts. By applying the protocol, cDNA clones with inserts longer than 2 kb were isolated from a cDNA library of human immature myeloid cell line KG-1, and the coding sequences of 40 new genes were predicted. A computer search of the sequences indicated that 20 genes contained sequences similar to known genes in the GenBank/EMBL databases. The sequences of the remaining 20 genes were entirely new, and characteristic protein motifs or domains were identified in 32 genes. Other sequence features noted were that the coding sequences of 23 genes were followed by relatively long stretches of 3′-untranslated sequences and that 5 genes contained repetitive sequences in their 3′-untranslated regions. The chromosomal location of these genes has been determined. By increasing the scale of the above analysis, the coding sequences of many unidentified genes can be predicted.

Original language: English
Pages (from-to): 27-35
Number of pages: 9
Journal: DNA Research
Volume: 1
Issue number: 1
Publication status: Published - 1994 Jan 1
Externally published: Yes

Keywords

  • Full-length cDNA sequence
  • Myeloid cell line KG-1
  • Protein motif
  • Unidentified human gene
  • cDNA library

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
