Background: With the advent of metabolomics as a powerful tool for both functional and biomarker discovery, the identification of specific differences between complex metabolite profiles is becoming a major challenge in the data analysis pipeline. The task remains difficult, given the datasets' size, complexity, and common shifts in migration (elution/retention) times between samples analyzed by hyphenated mass spectrometry methods.
Results: We present MathDAMP (Mathematica package for Differential Analysis of Metabolite Profiles), a package for Mathematica (Wolfram Research, Inc.) that highlights differences between raw datasets acquired by hyphenated mass spectrometry methods by applying arithmetic operations to all corresponding signal intensities on a datapoint-by-datapoint basis. Peak identification and integration are thus bypassed, and the results are displayed graphically. To facilitate direct comparisons, the raw datasets are automatically preprocessed and normalized in terms of both migration times and signal intensities. A combination of dynamic programming and global optimization is used to align the datasets along the migration time dimension. The processed datasets and the results of direct comparisons between them are visualized as density plots (the axes represent migration time and m/z values, and peaks appear as color-coded spots), providing an intuitive overall view. Various forms of comparisons and statistical tests can be applied to highlight subtle differences. Overlaid electropherograms (chromatograms) corresponding to the vicinities of the candidate differences from any result may be generated in descending order of significance for visual confirmation. Additionally, a standard library table (a list of m/z values and migration times for known compounds) may be aligned and overlaid on the plots to allow easier identification of metabolites.
Conclusion: Our tool facilitates the visualization and identification of differences between complex metabolite profiles according to various criteria in an automated fashion and is useful for data-driven discovery of biomarkers and functional genomics.
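The datapoint-by-datapoint comparison described in the Results can be illustrated with a minimal sketch. It is written in Python rather than Mathematica and does not reproduce MathDAMP's code or API: the function names, the nested-list data layout (one row per m/z channel, one column per migration time point), and the use of a plain subtraction as the arithmetic operation are all assumptions made for illustration. The two datasets are assumed to be already aligned and normalized.

```python
# Hypothetical illustration only; these names are not part of MathDAMP.
# Each dataset is a list of rows: one row per m/z channel,
# one column per (already aligned) migration time point.

def difference_map(sample, reference):
    """Subtract the reference intensity from the sample intensity at every datapoint."""
    return [
        [s - r for s, r in zip(sample_row, reference_row)]
        for sample_row, reference_row in zip(sample, reference)
    ]

def ranked_candidates(diff, top=3):
    """Return (|difference|, mz_index, time_index) tuples, largest difference first."""
    flat = [
        (abs(value), mz_index, time_index)
        for mz_index, row in enumerate(diff)
        for time_index, value in enumerate(row)
    ]
    return sorted(flat, reverse=True)[:top]

# Toy data (invented for this example): the sample has one extra peak
# and one slightly larger peak compared with the reference.
reference = [
    [0, 10, 0, 0],
    [0, 0, 20, 0],
    [0, 0, 0, 0],
]
sample = [
    [0, 10, 0, 0],
    [0, 0, 25, 0],
    [0, 30, 0, 0],
]

diff = difference_map(sample, reference)
print(ranked_candidates(diff, top=2))  # prints [(30, 2, 1), (5, 1, 2)]
```

Because every datapoint is compared directly, no peak has to be detected or integrated first; the ranked list plays the role of the candidate differences that would then be inspected visually.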
ASJC Scopus subject areas
- Structural Biology
- Molecular Biology
- Computer Science Applications
- Applied Mathematics