Lior Friedman, M.Sc. Thesis Seminar
Induction algorithms have steadily improved over the years, resulting in powerful methods for learning. However, these methods are restricted to the knowledge contained in the supplied feature vectors.
Recently, a large collection of common-sense and domain-specific relational knowledge bases has become available on the web. The natural question is how these knowledge bases can be exploited by existing induction algorithms.
In this work we propose a novel algorithm for using relational data to generate recursive features. Given a feature, the algorithm recursively defines a new learning task over its set of values, and uses the relational data to construct feature vectors for the new task. The resulting classifier is then added as a new feature.
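As a rough illustration of the idea above, the following Python sketch runs a single level of the recursion on toy data. It is not the algorithm from the thesis: the knowledge base `KB`, the training set `TRAIN`, the helper names `label_values` and `learn_stump`, the majority-vote labeling of feature values, and the use of a one-fact decision stump as the learner are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy knowledge base: entity -> set of (relation, object) facts.
KB = {
    "Messi":   {("occupation", "footballer"), ("bornIn", "Argentina")},
    "Ronaldo": {("occupation", "footballer"), ("bornIn", "Portugal")},
    "Neymar":  {("occupation", "footballer"), ("bornIn", "Brazil")},
    "Merkel":  {("occupation", "politician"), ("bornIn", "Germany")},
    "Obama":   {("occupation", "politician"), ("bornIn", "USA")},
}

# Original task (hypothetical): each document is reduced to the value of one
# base feature, "mentioned entity", plus a label (1 = sports, 0 = other).
TRAIN = [("Messi", 1), ("Ronaldo", 1), ("Merkel", 0), ("Obama", 0)]


def label_values(examples):
    """Define the new task: label each feature value by majority vote
    over the original examples that carry it (an assumed labeling rule)."""
    votes = {}
    for value, label in examples:
        votes.setdefault(value, Counter())[label] += 1
    return {v: counts.most_common(1)[0][0] for v, counts in votes.items()}


def learn_stump(value_labels, kb):
    """Build feature vectors for the values from KB facts and learn a
    one-fact classifier: the fact whose presence best matches the labels."""
    facts = sorted({f for v in value_labels for f in kb.get(v, set())})

    def agreement(fact):
        return sum((fact in kb.get(v, set())) == bool(y)
                   for v, y in value_labels.items())

    best = max(facts, key=agreement)
    return best, (lambda v: int(best in kb.get(v, set())))


# The learned classifier over values becomes a new feature of the documents.
fact, recursive_feature = learn_stump(label_values(TRAIN), KB)
print(fact)
print(recursive_feature("Neymar"))  # entity never seen in TRAIN
```

In this sketch the new feature fires for "Neymar" even though that entity never appears in the training documents, because the knowledge base links it to the same fact as the positive training values. A full recursive version would apply the same construction again to the features of the new task.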
We applied our algorithm to the domain of text categorization, using large semantic knowledge bases such as YAGO. We show that the generated recursive features significantly improve the performance of existing induction algorithms.