The Taub Faculty of Computer Science Events and Talks
Aviv A. Rosenberg (Ph.D. Thesis Seminar)
Sunday, 12.02.2023, 14:30
Advisor: Prof. Alex M. Bronstein
Proteins fold from a sequence of amino acids, forming secondary structures which subsequently fold into a three-dimensional structure that enables their function. The amino acid sequence is encoded in the genetic sequence as codons, many of which are synonymous, i.e., they code for the same amino acid. The "one sequence, one structure" dogma, established over half a century ago, remains the commonly accepted notion, and implies that synonymous coding is inconsequential to protein structure. This talk will present results from three works that challenge this dogma through large-scale computational analysis of protein structures.
First, we develop novel methods for computing and comparing codon-specific protein backbone angle distributions. We design a non-parametric approach for comparing these bivariate distributions using finite samples, and identify synonymous codon distributions which are distinguishable, with statistical significance, within some secondary structures. This demonstrates, for the first time, an association between synonymous codon usage and the final protein structure around the amino acids they translate into.
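As a rough illustration of what a non-parametric, finite-sample comparison of two backbone angle distributions can look like, the sketch below runs a permutation test on two sets of (phi, psi) angle pairs. It is not the method presented in the talk: the energy-distance-style statistic, the wraparound distance, and all function names (torus_distance, energy_statistic, permutation_test) are assumptions chosen for illustration, and the data are synthetic.

```python
import math
import random


def torus_distance(a, b):
    """Distance between two (phi, psi) points in degrees, with 360-degree wraparound.

    Hypothetical helper: angles are periodic, so -179 and 179 are 2 degrees apart.
    """
    d_phi = abs(a[0] - b[0]) % 360.0
    d_psi = abs(a[1] - b[1]) % 360.0
    d_phi = min(d_phi, 360.0 - d_phi)
    d_psi = min(d_psi, 360.0 - d_psi)
    return math.hypot(d_phi, d_psi)


def mean_cross_distance(xs, ys):
    """Average wraparound distance over all pairs (x, y) with x in xs, y in ys."""
    total = sum(torus_distance(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_statistic(xs, ys):
    """Energy-distance-style statistic: 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    Assumption: used here only as a convenient two-sample statistic;
    larger values suggest the two samples differ more.
    """
    return (2.0 * mean_cross_distance(xs, ys)
            - mean_cross_distance(xs, xs)
            - mean_cross_distance(ys, ys))


def permutation_test(xs, ys, n_permutations=200, seed=0):
    """Return (observed statistic, p-value) under random relabeling of samples."""
    rng = random.Random(seed)
    observed = energy_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:n], pooled[n:]) >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed labeling itself, so p is never zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic stand-ins for angle samples from two synonymous codons.
    rng = random.Random(42)
    codon_a = [(rng.gauss(-60, 15), rng.gauss(-45, 15)) for _ in range(30)]
    codon_b = [(rng.gauss(-65, 15), rng.gauss(-40, 15)) for _ in range(30)]
    stat, p = permutation_test(codon_a, codon_b)
    print(f"statistic={stat:.3f}, p-value={p:.3f}")
```

A permutation test makes no assumption about the shape of the distributions, which suits angle data that is periodic and multi-modal. The price is computational, since each permutation recomputes all pairwise distances.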
Next, we extend this approach to pairs of amino acids, accounting for the peptide bond formed between consecutive amino acids during translation of the genetic code. To that end, we introduce a tool for defining local sub-secondary structural units that are two amino acids long. We analyze the joint distribution of backbone angles across the peptide bond and show that our structural units represent backbone conformations more meaningfully than conventional secondary structure.
Finally, building on these tools, we devise a constructive approach for pinpointing locations in highly similar protein structures that have vastly different local backbone conformations despite residing in environments with an identical sequence and potential interaction network. We show that such conformational differences are stable under molecular dynamics simulations, and that they are not predicted by AlphaFold, a state-of-the-art structure prediction model which relies only on the amino acid sequence. Our data-driven approach provides biologists with invaluable dogma-defying examples, guiding further research into the mechanisms behind protein folding.