Kira Radinsky, Sagie Davidovich and Shaul Markovitch. Learning Causality for News Events Prediction. In Proceedings of WWW 2012, 909-918, Lyon, France, 2012.
The problem we tackle in this work is, given a present news event, to generate a plausible future event that can be caused by the given event. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precise labeled causality examples, we mine 150 years of news articles, and apply semantic natural language modeling techniques to titles containing certain predefined causality patterns. For generalization, the model uses a vast amount of world knowledge ontologies mined from LinkedData, containing 200 datasets with approximately 20 billion relations. Empirical evaluation on real news articles shows that our Pundit algorithm reaches a human-level performance.
@inproceedings{Radinsky:2012:LCN,
  Author = {Kira Radinsky and Sagie Davidovich and Shaul Markovitch},
  Title = {Learning Causality for News Events Prediction},
  Year = {2012},
  Booktitle = {Proceedings of WWW 2012},
  Pages = {909--918},
  Address = {Lyon, France},
  Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Radinsky-Davidovich-Markovitch-WWW2012.pdf},
  Keywords = {Information Retrieval, Temporal Reasoning, Prediction},
  Secondary-keywords = {Common-Sense Knowledge},
  Abstract = {The problem we tackle in this work is, given a present news event, to generate a plausible future event that can be caused by the given event. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precise labeled causality examples, we mine 150 years of news articles, and apply semantic natural language modeling techniques to titles containing certain predefined causality patterns. For generalization, the model uses a vast amount of world knowledge ontologies mined from LinkedData, containing 200 datasets with approximately 20 billion relations. Empirical evaluation on real news articles shows that our Pundit algorithm reaches a human-level performance.}
}