Elucidating the molecular mechanisms of selective ligand recognition by proteins is a long-standing problem in drug discovery. The rapid increase in the availability of three-dimensional protein structural data makes a data-driven approach to finding the rules that govern protein-ligand interactions increasingly attractive. However, this approach is not straightforward, because molecular interactions are complex and the diversity of interactions that occur during ligand recognition is poorly understood. We therefore aimed to provide a comprehensive classification of the spatial arrangements of ligand atoms based on the local coordinates of each interacting "protein fragment", defined as three covalently bonded atoms within an amino acid. Using a pattern recognition technique based on the Gaussian mixture model, we found 13 519 patterns in the spatial arrangements of interacting ligand atoms, each of which was described as a Gaussian function of the local coordinates. Some well-known interaction patterns, such as hydrogen bonds, were found across several hundred protein families, whereas others were observed only in a few specific protein families. After removing protein sequence redundancy from the data set, we found that 63.4% of ligand atoms interacted via one or more interaction patterns and that 25.7% of ligand atoms interacted without patterns, whereas the remainder had no direct interactions. The top 3115 major patterns included 90% of the interacting pairs of residues and ligand atoms with patterns, while the top 6229 included all of them.
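The idea of describing a ligand atom in the local coordinates of a three-atom protein fragment can be illustrated with a minimal sketch. The abstract does not state which axis convention was used, so the frame below is an assumption: the origin sits on the middle atom, the x axis points toward the first atom, and the z axis is normal to the plane of the three atoms. The function names `local_frame` and `to_local` are hypothetical and are not taken from the paper.

```python
import numpy as np


def local_frame(a, b, c):
    """Build a right-handed orthonormal frame from three bonded atoms a-b-c.

    Assumed convention (not from the source): origin at the middle atom b,
    x axis along b->a, z axis normal to the a-b-c plane, y = z cross x.
    The three atoms must not be collinear.
    """
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    x = a - b
    x /= np.linalg.norm(x)
    z = np.cross(x, c - b)
    z /= np.linalg.norm(z)
    y = np.cross(z, x)
    # Rows of the returned matrix are the local axes in global coordinates.
    return b, np.vstack([x, y, z])


def to_local(ligand_xyz, a, b, c):
    """Express one ligand atom (shape (3,)) or many (shape (N, 3)) in the fragment frame."""
    origin, axes = local_frame(a, b, c)
    return (np.asarray(ligand_xyz, dtype=float) - origin) @ axes.T
```

Coordinates computed this way do not change when the whole complex is rotated or translated, so ligand atoms observed around equivalent fragments in different structures can be pooled. A mixture model such as the Gaussian mixture model named in the abstract could then be fitted to the pooled points.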
ASJC Scopus subject areas
- Chemical Engineering(all)
- Computer Science Applications
- Library and Information Sciences