A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis

Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich and Shaul Markovitch. A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. In Proceedings of the 20th International World Wide Web Conference, pages 337-346, Hyderabad, India, 2011.

Abstract

Computing the degree of semantic relatedness of words is a key functionality of many language applications such as search, clustering, and disambiguation. Previous approaches to computing semantic relatedness mostly used static language resources, while essentially ignoring their temporal aspects. We believe that a considerable amount of relatedness information can also be found in studying patterns of word usage over time. Consider, for instance, a newspaper archive spanning many years. Two words such as "war" and "peace" might rarely co-occur in the same articles, yet their patterns of use over time might be similar. In this paper, we propose a new semantic relatedness model, Temporal Semantic Analysis (TSA), which captures this temporal information. The previous state-of-the-art method, Explicit Semantic Analysis (ESA), represented word semantics as a vector of concepts. TSA uses a more refined representation, where each concept is no longer scalar, but is instead represented as a time series over a corpus of temporally ordered documents. To the best of our knowledge, this is the first attempt to incorporate temporal evidence into models of semantic relatedness. Empirical evaluation shows that TSA provides consistent improvements over the state-of-the-art ESA results on multiple benchmarks.
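
The intuition in the abstract, that two words can be related through similar usage patterns over time even when they rarely co-occur, can be illustrated with a minimal sketch. This is not the TSA algorithm from the paper: the yearly counts are made-up numbers, the function name `pearson` is chosen here for illustration, and Pearson correlation is only an assumed stand-in for whatever time-series similarity measure the paper actually uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no temporal pattern to compare.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly usage counts in a news archive (illustrative only).
war = [5, 9, 30, 42, 12, 6]
peace = [3, 6, 22, 35, 10, 4]
recipe = [11, 10, 12, 11, 10, 12]

print("war vs peace: ", round(pearson(war, peace), 3))
print("war vs recipe:", round(pearson(war, recipe), 3))
```

With these invented counts, "war" and "peace" rise and fall together and score close to 1, while "recipe" stays roughly flat and scores much lower. A static co-occurrence measure would not see this signal if the two words seldom appear in the same article.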

Keywords: Semantic Relatedness, ESA, Explicit Semantic Analysis, Temporal Reasoning

Online version: http://www.cs.technion.ac.il/~shaulm/papers/pdf/Radinsky-WWW2011.pdf

Bibtex entry:

@inproceedings{Radinsky:2011:WTS,
  Author = {Kira Radinsky and Eugene Agichtein and Evgeniy Gabrilovich and Shaul Markovitch},
  Title = {A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis},
  Year = {2011},
  Booktitle = {Proceedings of the 20th International World Wide Web Conference},
  Month = {March},
  Pages = {337--346},
  Address = {Hyderabad, India},
  Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Radinsky-WWW2011.pdf},
  Keywords = {Semantic Relatedness, ESA, Explicit Semantic Analysis, Temporal Reasoning},
  Abstract = {
    Computing the degree of semantic relatedness of words is a key
    functionality of many language applications such as search,
    clustering, and disambiguation. Previous approaches to computing
    semantic relatedness mostly used static language resources, while
    essentially ignoring their temporal aspects. We believe that a
    considerable amount of relatedness information can also be found
    in studying patterns of word usage over time. Consider, for
    instance, a newspaper archive spanning many years. Two words such
    as ``war'' and ``peace'' might rarely co-occur in the same
    articles, yet their patterns of use over time might be similar. In
    this paper, we propose a new semantic relatedness model, Temporal
    Semantic Analysis (TSA), which captures this temporal information.
    The previous state-of-the-art method, Explicit Semantic Analysis
    (ESA), represented word semantics as a vector of concepts. TSA
    uses a more refined representation, where each concept is no
    longer scalar, but is instead represented as a time series over a
    corpus of temporally ordered documents. To the best of our
    knowledge, this is the first attempt to incorporate temporal
    evidence into models of semantic relatedness. Empirical evaluation
    shows that TSA provides consistent improvements over the
    state-of-the-art ESA results on multiple benchmarks.
  }
}