The Cheminformatics & Chemogenomics Research Group (CCRG), led by David Wild, develops algorithms and tools for large-scale integrative data mining of drug discovery, chemical, and biological data. It has three main research threads:

(i) Integration of public chemical and biological datasets using Semantic Web technologies. Our main contribution is Chem2Bio2RDF, an integrated RDF-based resource that supports integrative searching of large amounts of publicly accessible drug discovery data on chemical compounds, drugs, targets, genes, diseases, pathways and drug side effects. It is described in our 2010 BMC Bioinformatics paper. We have also developed the Chem2Bio2OWL ontology for chemogenomic annotation of the integrated dataset.

(ii) Development of novel algorithms and tools for integrated data mining in drug discovery. Major contributions include graph-based association finding between chemical and biological entities (see the association search tool); the BioLDA topic modeling method (see our 2011 PLoS ONE paper); the Semantic Linked Association Prediction method for probabilistic prediction of new compound-gene and other relationships (publication in preparation); the BARD method for protein polypharmacology prediction (see our 2011 Bioinformatics paper); the WENDI tool for new compound profiling (see our 2010 Journal of Cheminformatics paper); and rule-based semantic inference of compound-disease relationships (see our 2011 BMC Bioinformatics paper).

(iii) Applying integrative methods to drug discovery and related problems via external collaborations. This is the most recent aspect of our work. Current major efforts include: integrative virtual screening for PXR antagonists, which has yielded novel antagonists (collaboration with University of Cincinnati); application of integrative methods to discover novel drugs for malaria (collaboration with OSDD); collaborations with Eli Lilly and Pfizer on applications in drug discovery; and exploring networks of natural products for aroma research.
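
To give a rough feel for the graph-based association finding mentioned under (ii), the sketch below searches a small labeled graph for the shortest chain of relationships linking two entities. It is only an illustration under stated assumptions: the node names, edge labels, and the function names build_adjacency and find_association are hypothetical placeholders, the graph is toy data rather than anything drawn from Chem2Bio2RDF, and plain breadth-first search stands in for whatever method the association search tool actually uses.

```python
from collections import deque

# Hypothetical toy graph: (source, relationship, target) triples.
# All identifiers are placeholders, not real compounds, genes, or diseases.
EDGES = [
    ("compound:C1", "binds", "target:T1"),
    ("target:T1", "encoded_by", "gene:G1"),
    ("gene:G1", "associated_with", "disease:D1"),
    ("compound:C1", "causes", "side_effect:S1"),
    ("compound:C2", "binds", "target:T1"),
]


def build_adjacency(edges):
    """Index the triples so each node maps to its (label, neighbor) pairs.

    Edges are treated as undirected, since an association can be followed
    in either direction.
    """
    adjacency = {}
    for source, label, target in edges:
        adjacency.setdefault(source, []).append((label, target))
        adjacency.setdefault(target, []).append((label, source))
    return adjacency


def find_association(edges, start, goal):
    """Return the shortest path from start to goal as an alternating list
    [node, label, node, ...], or None if the two are not connected."""
    adjacency = build_adjacency(edges)
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [label, neighbor]))
    return None


if __name__ == "__main__":
    print(find_association(EDGES, "compound:C1", "disease:D1"))
```

In this toy graph the compound reaches the disease through a target and a gene, which is the kind of multi-step link that an integrated dataset makes searchable. A realistic version would need to rank or weight the paths it finds, which this sketch does not attempt.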