Shaul Markovitch - Research

Research summary

(Note: A paper's hypertext link in the research summary will not download the paper but will take you to the paper's entry in the paper list. From there you will be able to get a PostScript version, an abstract, or a BibTeX entry.)

Selective Learning

Learning is a process of using experiences to generate (or modify) knowledge with the goal of improving the potential performance, with respect to given criteria, of a problem solver that uses that knowledge. When the benefit of the learned knowledge is outweighed by its cost, we face the utility problem. To acquire knowledge with high utility, a system must be selective during learning.

In the paper Information filtering: Selective Processes In Learning Systems by Markovitch and Scott we describe the information filtering framework, which views learning as a process where information flows from the experience space, through an attention procedure and an acquisition procedure, to a knowledge base, and from there to the problem solver. The framework defines five types of information filters according to their location along this flow: selective experience, selective attention, selective acquisition, selective retention (forgetting) and selective utilization.
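As an illustration only, the five filter points can be sketched as predicates applied along the information flow; none of these function or filter names come from the paper:

```python
def learn(experience_space, select, attend, acquire, retain, utilize):
    """Pass experiences through the five filter points and return the
    knowledge that the problem solver is actually allowed to use."""
    kb = []
    for exp in experience_space:
        if not select(exp):        # selective experience
            continue
        if not attend(exp):        # selective attention
            continue
        if acquire(exp):           # selective acquisition
            kb.append(exp)
    kb = [k for k in kb if retain(k)]       # selective retention (forgetting)
    return [k for k in kb if utilize(k)]    # selective utilization
```

For example, calling `learn(range(10), ...)` with simple numeric predicates shows how each filter narrows the stream of candidate knowledge before the problem solver ever sees it.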

DIDO is a learning system that builds an egocentric model of its environment through experimentation. The paper Learning Novel Domains Through Curiosity and Conjecture by Scott and Markovitch gives a short account of the system. The system uses uncertainty-based selective experience to direct its learning; this methodology is described in the paper Experience Selection and Problem Choice In an Exploratory Learning System. The system builds conjectures readily, but uses a selective retention filter to retract those that prove redundant or incorrect. This aspect of DIDO is described in the paper Representation Generation in An Exploratory Learning System.

LASSY is a learning system that acquires domain-specific knowledge to accelerate a PROLOG interpreter. A complete description of the system can be found in my thesis. LASSY uses both inductive and deductive methods for learning. The deductive part was used to study the information filtering framework; it demonstrates the tradeoff between acquisition filters, retention filters and utilization filters. [This part is described here] The utilization filtering method is described in the paper Utilization Filtering: a method for reducing the inherent harmfulness of deductively learned knowledge.

Recently we have resumed our research on accelerating logic inference in LASSY's framework. The paper The Divide-and-Conquer Subgoal-Ordering Algorithm for Speeding up Logic Inference by Ledeniov and Markovitch describes a new algorithm that finds a lowest-cost ordering by exploiting independence between subgoals. The utility problem arises when the cost of ordering outweighs the benefit of the improved sequence. The paper Learning Investment Functions for Controlling the Utility of Control Knowledge (an expanded version is also available) describes a method to overcome this problem. The method involves converting the ordering procedure into an anytime algorithm, learning resource investment functions, and using them to control the resources spent by the ordering procedure.
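A toy sketch of the anytime idea, under the simplifying assumption that subgoals are independent with known costs and success probabilities; the fixed threshold `step_cost` stands in for a learned investment function, and all names are illustrative:

```python
def expected_cost(order):
    """Expected cost of proving a conjunction of (cost, success_prob)
    subgoals left to right, where any failure stops the proof."""
    total, reach = 0.0, 1.0
    for cost, p in order:
        total += reach * cost
        reach *= p
    return total

def anytime_order(subgoals, step_cost=0.05):
    """Improve the ordering one adjacent swap at a time; stop as soon as
    the best available swap saves less than the ordering effort it costs.
    `step_cost` plays the role of a learned investment threshold."""
    order = list(subgoals)
    while True:
        best_gain, best_i = 0.0, None
        for i in range(len(order) - 1):
            swapped = order[:i] + [order[i + 1], order[i]] + order[i + 2:]
            gain = expected_cost(order) - expected_cost(swapped)
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None or best_gain < step_cost:
            return order          # further ordering effort is not worth its cost
        order[best_i], order[best_i + 1] = order[best_i + 1], order[best_i]
```

Raising `step_cost` makes the procedure settle for a cheaper, rougher ordering, which is exactly the cost/benefit trade the investment functions are meant to control.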

Another work on utilization filtering in the context of macro-learning is described in the paper Utilization Filtering of Macros Based on Goal Similarity by Keidar, Markovitch and Webman.

The paper The Role of Forgetting in Learning studies the problem of selective retention in the context of macro learning.

Micro-Hillary is a selective macro-learning algorithm that uses an attention filter which considers only subpaths that connect local minima to states with better heuristic values. Micro-Hillary is able to solve sliding-tile puzzles of size 50x50 after training on small self-generated problems. The paper A Selective Macro-learning Algorithm and its Application to the NxN Sliding-Tile Puzzle by Markovitch and Finkelstein describes the algorithm.
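The attention filter can be illustrated with a small sketch: assuming a heuristic `h` and a `neighbors` function, a macro is recorded only when a local minimum is escaped to a strictly better state. The escape search here is a plain breadth-first search, an illustrative choice rather than Micro-Hillary's actual procedure:

```python
from collections import deque

def escape_macro(state, neighbors, h, max_states=10_000):
    """If `state` is a local minimum of h, search outward for a strictly
    better state and return the connecting subpath (the candidate macro);
    otherwise return None, since plain hill-climbing suffices there."""
    if min(h(n) for n in neighbors(state)) < h(state):
        return None                      # not a local minimum: no macro recorded
    frontier = deque([[state]])
    seen = {state}
    while frontier and len(seen) < max_states:
        path = frontier.popleft()
        for n in neighbors(path[-1]):
            if n in seen:
                continue
            seen.add(n)
            if h(n) < h(state):          # escaped the minimum
                return path[1:] + [n]    # this subpath becomes a macro
            frontier.append(path + [n])
    return None
```

On a toy one-dimensional landscape, only states that are genuine local minima yield macros; everywhere else the filter rejects the experience.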

Speedup Learning

Most of the research in the machine learning community deals with building learners that improve the classification accuracy of classifiers. A smaller part of the community studies learners that improve the speed of problem solvers. This type of learning is called speedup learning.

The goal of the LASSY system is to accelerate a PROLOG interpreter using acquired domain-specific knowledge. LASSY uses two learning mechanisms: deductive lemma learning and inductive learning for subgoal reordering. The paper Automatic Ordering of Subgoals - a Machine Learning Approach describes the algorithms used for reordering subgoals. The lemma-learning mechanism is described elsewhere. A new algorithm for subgoal ordering is presented in the paper The Divide-and-Conquer Subgoal-Ordering Algorithm for Speeding up Logic Inference by Oleg Ledeniov and Shaul Markovitch. The Divide-and-Conquer algorithm exploits the dependencies between subgoals to accelerate the ordering process. The methods that we use to acquire control knowledge for the algorithm are described in the paper Learning Investment Functions for Controlling the Utility of Control Knowledge.
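For intuition, under the simplifying assumption of mutually independent subgoals with known costs and success probabilities, a classical ordering rule (not necessarily the papers' algorithm) sorts subgoals by cost divided by failure probability:

```python
def expected_cost(order):
    """Expected cost of solving (cost, success_prob) subgoals in the given
    order, where the first failing subgoal aborts the conjunction."""
    total, reach = 0.0, 1.0
    for cost, p in order:
        total += reach * cost
        reach *= p
    return total

def order_subgoals(subgoals):
    """Sort independent subgoals by cost / failure-probability; an adjacent
    exchange argument shows this minimizes the expected cost above."""
    def ratio(goal):
        cost, p = goal
        return float("inf") if p >= 1.0 else cost / (1.0 - p)
    return sorted(subgoals, key=ratio)
```

The exchange argument is one line: placing subgoal a before b is better exactly when c_a(1 - p_b) <= c_b(1 - p_a), i.e. when a's ratio is smaller.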

One of the most common methods used for speedup learning is the acquisition of macro-operators. Micro-Hillary, a simple selective macro-learning algorithm, is a very successful speedup learner that can solve sliding-tile puzzles of any size. In the paper Systematic experimentation with deductive learning: Satisficing vs. optimizing search by Markovitch and Rosdeutscher we study macro-learning in grid domains.

Learning in Multi-Agent Systems

When multiple agents act in the same environment, an agent can benefit by learning models of the other agents involved. The paper Learning models of intelligent agents, by Carmel and Markovitch, views interaction as a repeated game (in the game-theoretic sense). Assuming that the other agent uses a regular strategy, modeling reduces to the problem of inducing finite automata from observations [A longer version of the paper can be found here]. When an adaptive agent interacts with another agent, it faces two conflicting goals: choosing actions that maximize the expected utility, and choosing actions that reveal as much as possible about the opponent's strategy. The paper Exploration and Adaptation in Multiagent Systems: A Model-based Approach describes methods to cope with this difficulty. In the paper Exploration Strategies for Model-based Learning in Multiagent Systems we describe a novel look-ahead strategy that takes into account the risk involved in exploration.
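As a minimal illustration of automaton-style modeling, and much simpler than the induction algorithm of the paper, one can fit a reaction table mapping the agent's previous move to the opponent's next move:

```python
def learn_reactive_model(my_moves, opp_moves):
    """Fit the simplest 'regular' opponent model: a reaction table from my
    previous move to the opponent's next move (in effect, a one-state-
    per-symbol automaton). Inducing a full finite automaton from
    observations, as in the paper, is considerably more involved."""
    model = {}
    for prev, reaction in zip(my_moves, opp_moves[1:]):
        model[prev] = reaction   # later observations overwrite earlier ones
    return model
```

Applied to a repeated Prisoner's Dilemma history in which the opponent copies the agent's last move, the learned table maps each move to itself, which is the signature of Tit-for-Tat.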

The paper Incorporating Opponent models into adversary search, by Carmel and Markovitch, deals with interaction in zero-sum games. The paper describes a generalization of minimax that enables exploitation of recursive opponent models in adversary search. A longer version of the paper also deals with uncertain models. The paper Learning and Using Opponent Models describes a method for learning opponent models. The paper Pruning Algorithms for Multi-model Adversary Search describes some theoretical and experimental results related to pruning in multi-model search.
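The idea of exploiting an opponent model in search can be sketched as follows. This is an illustrative simplification, a one-level model in which the predicted opponent simply maximizes its own evaluation, not the exact algorithm of the papers:

```python
def om_search(tree, my_turn, my_eval, opp_eval):
    """Return (value for me, chosen move) for a game tree given as nested
    dicts of move -> subtree, with non-dict leaves as terminal positions."""
    if not isinstance(tree, dict):            # leaf position
        return my_eval(tree), None
    if my_turn:
        move = max(tree, key=lambda m: om_search(tree[m], False, my_eval, opp_eval)[0])
        return om_search(tree[move], False, my_eval, opp_eval)[0], move
    # Opponent's turn: predict their move with THEIR evaluation (the model),
    # then score only the predicted continuation with MY evaluation.
    pred = max(tree, key=lambda m: om_search(tree[m], True, opp_eval, opp_eval)[0])
    return om_search(tree[pred], True, my_eval, opp_eval)[0], pred
```

On a two-ply tree with leaves carrying (my score, opponent score), this can choose differently from plain minimax: a move that looks risky under worst-case play may be safe against the modeled opponent, because the opponent is predicted to prefer its own payoff rather than to minimize mine.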

The paper Learning of Resource Allocation Strategies for Game Playing, by Markovitch and Sella, studies the problem of distributing resources over a sequence of actions. Given an unknown game, the agent learns classification rules that distinguish between "easy" and "difficult" states. [A longer version of the paper]. The paper Derivative Evaluation Function Learning Using Genetic Operators, by Lorenz and Markovitch, describes a method for applying genetic algorithms to evaluation-function learning.

Learning in Natural Language Processing

The paper Contextual word similarity and estimation from sparse data, by Dagan, Marcus and Markovitch, describes a method for estimating the probability of word co-occurrence based on similar words. The similarity between two words is determined by the similarity of their context patterns. [A longer version of the paper can be found here].
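A hedged sketch of the similarity-based idea, assuming simple co-occurrence count vectors and cosine similarity; the paper's actual similarity measure and weighting scheme may differ:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two sparse count vectors given as dicts."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_estimate(word, other, counts):
    """Estimate P(other | word) as a similarity-weighted average of the
    conditional frequencies of words whose context vectors resemble
    `word`'s. `counts` maps each word to its co-occurrence count dict."""
    total, weights = 0.0, 0.0
    for w, ctx in counts.items():
        sim = cosine(counts[word], ctx)
        n = sum(ctx.values())
        if sim > 0 and n:
            total += sim * ctx.get(other, 0) / n
            weights += sim
    return total / weights if weights else 0.0
```

The point of the scheme is sparse-data smoothing: a pair that never co-occurs in the corpus can still receive a nonzero probability estimate if similar words do co-occur with it.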