Aviram Magen, M.Sc. Thesis Seminar
Thursday, 14.3.2019, 12:00
Billions of dollars are spent each year on developing new drugs. The first step in the drug development process is drug discovery, in which potential molecules are studied before leads are selected and moved to clinical trials.
Extensive research is performed to identify studies that might be relevant to the potential molecule or its substructures.
As the molecule is novel, no research has yet been published about it, and therefore identifying papers relevant to its different characteristics is highly challenging. In this paper, we present the novel task of ranking documents based on novel molecule queries. Given a chemical molecular structure, we wish to rank medical papers that will contribute to a researcher’s understanding of the novel molecule’s medical potential.
We present a set of ranking algorithms and molecular embeddings to address the task. We perform an extensive evaluation of the algorithms over the molecular embeddings, studying their performance on a benchmark retrieval corpus, which we share with the community. Additionally, we introduce a heterogeneous edge-labeled graph embedding approach to address the molecule ranking task. Our evaluation shows that the proposed embedding model can significantly improve molecule ranking methods.