TY - JOUR
T1 - Locus-specific mutation databases
T2 - Pitfalls and good practice based on the p53 experience
AU - Soussi, Thierry
AU - Ishioka, Chikashi
AU - Claustres, Mireille
AU - Béroud, Christophe
N1 - Funding Information:
Historically, collections of mutations and variations in human genes have been reported in the published literature. In the mid-1980s, several of these variations were available in the form of various databases, such as the Genome DataBase (GDB) [14], GenBank [15], the European Molecular Biology Laboratory (EMBL) [16] and Swiss-Prot [17]. However, because of the structure of these databases, the extraction of relevant information concerning mutations was almost impossible, so specific software had to be developed to facilitate this. Several teams started to develop specific databases to collect and document mutations in human genes. Today, several hundred locus-specific databases (LSDBs) are available through the Internet and have been recently reviewed [4]. Many of these databases are just a simple list of mutations that cannot be searched. They are also highly heterogeneous in terms of quality and content [4]. One of the main problems of these LSDBs concerns their follow-up. A recent survey of 138 known LSDBs of human gene mutations with available follow-up information found that 40 databases have not been updated since the year 2000 and that another 44 had only been updated between 2001 and 2003. Although the creation of a mutation database can be exciting and gratifying (in terms of publication), follow-up is time-consuming and less stimulating. As recently highlighted in a special report in Nature, financial issues are also involved and several databases, including the Asthma and Allergy gene database, were closed due to lack of funding [18]. These LSDBs are sponsored by only a few grants and they are usually developed ‘on the side’ (the Universal Mutation Database (UMD) for p53, created in 1991, only received one grant from a charity organization in 1995) [19]. Projects to generate central databases that
PY - 2006/1
Y1 - 2006/1
N2 - Between 50,000 and 60,000 mutations have been described in various genes that are associated with a wide variety of diseases. Reporting, storing and analysing these data is an important challenge as such data provide invaluable information for both clinical medicine and basic science. Locus-specific databases have been developed to exploit this huge volume of data. The p53 mutation database is a paradigm, as it constitutes the largest collection of somatic mutations (22,000). However, there are several biases in this database that can lead to serious erroneous interpretations. We describe several rules for mutation database management that could benefit the entire scientific community.
AB - Between 50,000 and 60,000 mutations have been described in various genes that are associated with a wide variety of diseases. Reporting, storing and analysing these data is an important challenge as such data provide invaluable information for both clinical medicine and basic science. Locus-specific databases have been developed to exploit this huge volume of data. The p53 mutation database is a paradigm, as it constitutes the largest collection of somatic mutations (22,000). However, there are several biases in this database that can lead to serious erroneous interpretations. We describe several rules for mutation database management that could benefit the entire scientific community.
UR - http://www.scopus.com/inward/record.url?scp=30144433850&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=30144433850&partnerID=8YFLogxK
U2 - 10.1038/nrc1783
DO - 10.1038/nrc1783
M3 - Review article
C2 - 16397528
AN - SCOPUS:30144433850
VL - 6
SP - 83
EP - 90
JO - Nature Reviews Cancer
JF - Nature Reviews Cancer
SN - 1474-175X
IS - 1
ER -