ConMedNP: a natural product library from Central African medicinal plants for drug discovery

Fidele Ntie-Kang, Pascal Amoa Onguene, Michael Scharfe, Luc C. Owono Owono, Eugene Megnassan, Luc Meva’a Mbaze, Wolfgang Sippl and Simon M. N. Efange

Abstract
We assess the medicinal value and “drug-likeness” of ~3200 compounds of natural origin, along with some of their derivatives obtained through hemisynthesis. The present study considers 376 distinct medicinal plant species belonging to 79 plant families from the Central African flora, based on data retrieved from literature sources. For each compound, the optimised 3D structure has been used to calculate the physicochemical properties which determine oral availability on the basis of Lipinski’s “Rule of Five”. A comparative analysis has been carried out with the “drug-like”, “lead-like” and “fragment-like” subsets, containing 1726, 738 and 155 compounds respectively, as well as with our smaller, previously published CamMedNP library and the Dictionary of Natural Products. A diversity analysis has been carried out in comparison with the DIVERSet™ Database (containing 48,651 compounds) from ChemBridge. Our results indicate that drug discovery beginning with natural products from the Central African flora could be promising. The 3D structures are available and could be useful for virtual screening and natural product lead generation programs.

Background
1. Phytomedicine is a part of health care systems around the world.
2. The use of computer-aided drug design (CADD) methods has become a very important part of the drug discovery process.
3. Natural products play a key role in drug discovery programs, serving both as drugs and as templates for the synthesis of drugs, even though the quantities and availability of samples for screening are often limited.

Objectives
1. Generate a virtual library of 3D structures of Congo Basin natural products.
2. Calculate their physicochemical properties.
3. Apply Lipinski’s “Rule of Five” to evaluate the likely oral availability of the compounds.

Image: Congo Basin tropical forest

Materials and methods

Data sources
The plant sources, geographical collection sites and chemical structures of pure compounds, as well as their spectroscopic data, were retrieved from literature sources comprising 31 PhD theses and journal articles, with references ranging from 1971 to 2013.

Generation of 3D models, optimization and calculation of molecular descriptors
Based on the known chemical structures of the NPs, all 3D molecular structures were generated with the builder module in the graphical user interface (GUI) of the MOE software, running on a Linux workstation with a 3.5 GHz Intel Core2 Duo processor. Energy minimization was subsequently carried out using the MMFF94 force field until a gradient of 0.01 kcal/mol was reached. The MW, NRB, log P, log S, HBA, HBD, THSA, TPSA, NO, NCC, NR and number of Lipinski violations were calculated using the molecular descriptor calculator included in the QuSAR module of the MOE package. The ChemBridge Diverset dataset (48,651 compounds) was downloaded from the official ChemBridge webpage. The LibMCS program of JKlustor was used for maximum common substructure clustering of the ConMedNP database. In the MCSS search, only structures with MW ≤ 600 were included, since MCSS clustering is only feasible on small molecules. Consequently, only 2785 of the ConMedNP compounds were analyzed. The compounds were fragmented using the RECAP algorithm.

Results and discussion

1. Origin and description of secondary metabolites
The 3,177 secondary metabolites were isolated or derived from 376 plant species belonging to 79 families from the Congo Basin rainforest in Central Africa, with 31.60% of the compounds being isolated or derived from plants in Central Africa for the very first time.
Most of the compounds were isolated from plants of the Leguminosae (14.38%), Moraceae (9.44%) and Guttiferae (9.40%) families (Figure 1). A general classification of the collected compounds showed that terpenoids form the largest class (24.41%), followed by flavonoids (18.52%) and alkaloids (15.17%) (Figure 2).

Figure 1: Plant families
Figure 2: Classes of compounds

2. Discussion of Lipinski’s criteria and property distribution
Lipinski’s “Rule of Five” (ro5) is often used as a filter for eliminating “non drug-like” compounds in the early stages of drug discovery protocols. Within ConMedNP, 50.02% of the compounds were Lipinski compliant and 79.61% showed fewer than two violations (Figure 3).

Figure 3: Lipinski violations

3. Comparison with the Dictionary of Natural Products and the CamMedNP library
The overall summary of the four Lipinski parameters for the three datasets reveals that both the CamMedNP and ConMedNP libraries are more “drug-like” than the DNP. This indicates that the chances of finding “lead-like” molecules with improved DMPK properties within these libraries are quite significant.

4. Diversity analysis
Histograms of the calculated descriptors (MW, HBA, HBD, log P, NR, NRB, NN, NO and TPSA) are shown in Figure 4 for ConMedNP (in light green) and the ChemBridge dataset (in red). The most common substructure selection (MCSS) panel for compound selection (Figure 5) is based on substructures that can be synthetically combined and are common in “drug-like” molecules, and it allows a direct selection and identification of compounds containing such substructures.

Figure 4: PCA descriptors
Figure 5: Most common substructure selection

Conclusion
Virtual screening workflows often involve docking a compound library into the binding site of a target receptor and using scoring functions and binding free energy calculations to identify putative binders.
The availability of 3D structures of the compounds to be used for docking is therefore of utmost importance. To the best of our knowledge, ConMedNP represents the largest “drug-like”, “lead-like”, “fragment-like” and diverse collection of 3D structures of NPs from the Central African forest, readily available for download. This dataset has the advantages of being relatively small, “drug-like”, diverse and easily accessible for virtual screening purposes. Thus the availability of such structures within ConMedNP, as well as their calculated physicochemical properties and indicators of “drug-likeness”, will facilitate the drug discovery process from leads that have been identified from Central African medicinal plants. An example of a drug discovery effort for a wide range of diseases, beginning from a Chinese natural products chemical library, has recently been described. Thus, the small “fragment-like” subset of 155 compounds, derived from ConMedNP, could serve as a suitable baseline for fragment-based drug design projects.

Acknowledgements
We are very grateful to the Royal Society of Chemistry for the travel grant awarded to PAO.
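
Supplementary sketch: counting Lipinski violations
The “Rule of Five” filtering reported in the Results can be illustrated with a short Python sketch. It is not the MOE/QuSAR workflow used for ConMedNP. It assumes the four descriptors (MW, log P, HBD, HBA) have already been calculated, and it uses the commonly quoted ro5 thresholds (MW ≤ 500, log P ≤ 5, HBD ≤ 5, HBA ≤ 10), which this poster does not spell out. The names `Descriptors`, `lipinski_violations` and `violation_summary`, the reading of “Lipinski compliant” as zero violations, and the example values are all hypothetical and are not taken from the ConMedNP data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mw: float     # molecular weight in Da
    log_p: float  # octanol/water partition coefficient
    hbd: int      # number of hydrogen bond donors
    hba: int      # number of hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four ro5 criteria a compound breaks."""
    return sum([
        d.mw > 500,    # assumed threshold: MW <= 500
        d.log_p > 5,   # assumed threshold: log P <= 5
        d.hbd > 5,     # assumed threshold: HBD <= 5
        d.hba > 10,    # assumed threshold: HBA <= 10
    ])


def violation_summary(library: List[Descriptors]) -> Dict[str, float]:
    """Percentage of compounds with zero and with fewer than two violations."""
    if not library:
        raise ValueError("library must contain at least one compound")
    counts = [lipinski_violations(d) for d in library]
    n = len(counts)
    return {
        # "compliant" is taken here to mean no violations (an assumption)
        "compliant_pct": 100 * sum(c == 0 for c in counts) / n,
        "under_two_pct": 100 * sum(c < 2 for c in counts) / n,
    }


# Made-up descriptor values, for illustration only
example_library = [
    Descriptors(mw=300.0, log_p=2.0, hbd=2, hba=4),
    Descriptors(mw=650.0, log_p=6.0, hbd=3, hba=8),
    Descriptors(mw=520.0, log_p=3.0, hbd=1, hba=5),
    Descriptors(mw=700.0, log_p=6.0, hbd=7, hba=12),
]

if __name__ == "__main__":
    print(violation_summary(example_library))
```

Applied to a full descriptor table, the same two percentages correspond to the kind of figures quoted in the Results (share of compliant compounds and share with fewer than two violations), provided the thresholds and the definition of compliance match those used by the descriptor software.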