Abstract:
Many eukaryotic genes are fragmented along the DNA, interrupted
by noncoding segments called introns. Until recently, the prevailing
concept held that introns evolve by nonadaptive forces as slightly
deleterious elements that lack any function. However, evidence of
anecdotal exceptions to this concept has been accumulating. In my
talk, I will present a thorough large-scale study that
implies a functional role for a large, previously unappreciated fraction
of introns, highlighting their relevance to the rapidly growing interest
in functional noncoding elements. To this end, I will lay out a
comprehensive model for the evolution of gene architecture, and will
introduce an intron-exon data set that is significantly larger than
previously studied ones. To obtain a definitive reconstruction of gene
architecture evolution, we interpret the phylogenetic tree as a
graphical model, and develop an expectation-maximization algorithm to
estimate the parameters of the model. We use an implementation of the
junction tree algorithm to compute the sufficient statistics that are
required for the expectation step.
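As a rough illustration of the expectation-maximization idea described above, and not the implementation used in this study, the sketch below treats intron presence (1) or absence (0) at one position as a binary state on a toy tree. On a tree, the junction tree algorithm reduces to an upward and a downward message pass, which is what the code does. The tree, the function names, and the single gain and loss probability shared by all branches are simplifying assumptions made for this example only.

```python
# Toy rooted tree (hypothetical): internal node -> children; leaves are not keys.
TREE = {"root": ["A", "n1"], "n1": ["B", "C"]}

def transition(gain, loss):
    """P[a][b]: probability that parent state a becomes child state b."""
    return [[1 - gain, gain], [loss, 1 - loss]]

def msg(P, vec):
    """Message passed up a branch: sum over child state b of P[a][b] * vec[b]."""
    return [sum(P[a][b] * vec[b] for b in (0, 1)) for a in (0, 1)]

def upward(node, tree, leaves, P, up):
    """Fill up[node][s] = P(observed leaves below node | node in state s)."""
    if node in leaves:
        vec = [0.0, 0.0]
        vec[leaves[node]] = 1.0
    else:
        vec = [1.0, 1.0]
        for child in tree[node]:
            upward(child, tree, leaves, P, up)
            m = msg(P, up[child])
            vec = [vec[s] * m[s] for s in (0, 1)]
    up[node] = vec

def e_step(tree, leaves, P, prior, root="root"):
    """Return, per branch, the posterior P(parent state, child state | data),
    together with the likelihood of the observed leaf pattern."""
    up = {}
    upward(root, tree, leaves, P, up)
    likelihood = sum(prior[s] * up[root][s] for s in (0, 1))
    # down[node][s] = P(data outside the subtree of node, node in state s)
    down = {root: list(prior)}
    joint = {}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in tree.get(parent, []):
            # Combine everything at the parent except this child's subtree.
            others = list(down[parent])
            for sib in tree[parent]:
                if sib != child:
                    m = msg(P, up[sib])
                    others = [others[s] * m[s] for s in (0, 1)]
            joint[(parent, child)] = [
                [others[a] * P[a][b] * up[child][b] / likelihood for b in (0, 1)]
                for a in (0, 1)
            ]
            down[child] = [sum(others[a] * P[a][b] for a in (0, 1)) for b in (0, 1)]
            stack.append(child)
    return joint, likelihood

def m_step(joints):
    """Re-estimate gain and loss from expected transition counts,
    pooled over all positions and branches."""
    n = [[0.0, 0.0], [0.0, 0.0]]
    for joint in joints:
        for J in joint.values():
            for a in (0, 1):
                for b in (0, 1):
                    n[a][b] += J[a][b]
    gain = n[0][1] / (n[0][0] + n[0][1])
    loss = n[1][0] / (n[1][0] + n[1][1])
    return gain, loss
```

In this sketch, one EM iteration calls e_step once per intron position and then passes the collected results to m_step. The per-branch joint posteriors play the role of the sufficient statistics. A realistic model would presumably allow rates to vary across branches and genes.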
This work culminated in several observations, some of which revise
common beliefs in the field. Taken together, these findings put forward
the possibility that once introns had invaded early eukaryotic genomes
in an arguably nonadaptive fashion, many were exploited in novel ways,
gradually gaining diverse functions, to the point that probably only
a few of today's eukaryotes could survive without them. The results of
this study were integrated with whole-genome multivariate analysis at
the systems level, showing that genes with high expression level and low
sequence evolutionary rate have a tendency to accumulate introns.
These ideas gain further credence from the fact that the positions of
many exon boundaries are known to be shared between distant eukaryotic
taxa; for example, in my data set 25% of the intron positions are shared
between plants and animals. This observation can be explained either by
remarkable conservation of ancient introns or by parallel, independent
intron gain at the same positions. Using my algorithm, a calculation of
the relative contributions of the two factors reveals that shared
ancestry is by far the dominant one (for example, more than 80% of the
introns shared by plants and animals are due to shared ancestry). While
a mechanistic explanation cannot be ruled out, such impressive
persistence of a substantial fraction of the introns is likely to reflect
their functional importance. Consequently, I suggest using conserved
intron positions as a novel tool for identifying functional noncoding
elements.
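
The comparison of shared ancestry against parallel gain described above can be illustrated with a deliberately simplified calculation. It is not the algorithm used in this study. The sketch assumes just two lineages descending directly from a common ancestor, with hypothetical per-lineage gain and loss probabilities and a prior probability that the ancestor carried the intron. Given that both descendants have the intron, Bayes' rule gives the probability that it was inherited rather than gained twice. All names and parameters here are assumptions made for illustration.

```python
def ancestral_posterior(prior, gain_a, loss_a, gain_b, loss_b):
    """P(intron present in the common ancestor | present in both descendants)."""
    # Shared ancestry: ancestor had the intron and neither lineage lost it.
    retained = prior * (1 - loss_a) * (1 - loss_b)
    # Parallel gain: ancestor lacked the intron and both lineages gained it.
    gained = (1 - prior) * gain_a * gain_b
    return retained / (retained + gained)

def shared_ancestry_fraction(positions):
    """Average posterior over shared positions; each position is a tuple
    (prior, gain_a, loss_a, gain_b, loss_b) of hypothetical parameters."""
    posteriors = [ancestral_posterior(*p) for p in positions]
    return sum(posteriors) / len(posteriors)
```

Averaging the per-position posteriors gives an estimate of the fraction of shared introns attributable to shared ancestry under these assumptions. On a full phylogeny, the same quantity would come from posterior state probabilities at the relevant internal node.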