Abstract:
Networks of locally interacting elements pose fundamental
challenges for learning their structure from partial, noisy data
and understanding how their elements cooperate to achieve a
joint function. Many such systems, ranging from cellular and
molecular pathways to communication and social networks, are
now becoming amenable to computational study as high-quality
empirical data rapidly accumulate.
I focus here on the problem of reconstructing and modeling
metabolic pathways by relating diverse properties of enzymes to
their relative positions in the pathway. We develop compact and
interpretable probabilistic models for representing
protein-domain co-occurrences and gene expression time courses.
These models are then combined to identify unknown enzymes
in the pathways, achieving significantly higher accuracy than
existing state-of-the-art approaches. By
systematically analyzing the relation between subgraphs of the
enzymatic network and the temporal expression profiles of
corresponding genes, we find that genes are regulated in a timely manner to
optimize metabolic performance in a changing environment,
suggesting a new organizational principle for the regulation of
metabolic pathways.