Learning is a process of using experiences to
generate (or modify) knowledge
with the goal of improving the potential performance,
with respect to given criteria, of a problem solver that uses
that knowledge. When the benefit of the learned knowledge
is outweighed by its cost, we face the *utility problem*.
To acquire knowledge with high utility, a system must be
selective during learning.

In the paper Information Filtering: Selective Processes in Learning Systems, by Markovitch and Scott, we describe the information filtering framework, which views learning as a process in which information flows from the experience space, through an attention procedure and an acquisition procedure, to a knowledge base, and from there to the problem solver. The framework defines five types of information filters according to their location within the information flow: selective experience, selective attention, selective acquisition, selective retention (forgetting), and selective utilization.
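The five filter locations can be illustrated with a minimal sketch. All names, data, and filter criteria below are invented for illustration; they only show where each filter sits in the information flow, not how any particular system implements it.

```python
# Toy experiences: feature lists with an illustrative utility estimate.
experiences = [
    {"features": ["a", "b", "noise"], "utility": 3},
    {"features": ["c", "noise"], "utility": -1},
    {"features": ["d"], "utility": 2},
]

def selective_experience(space):
    # Filter 1: choose which experiences to process at all.
    return [e for e in space if e["utility"] != 0]

def selective_attention(exp):
    # Filter 2: keep only the relevant parts of an experience.
    return {**exp, "features": [f for f in exp["features"] if f != "noise"]}

def selective_acquisition(item):
    # Filter 3: admit only items expected to help the problem solver.
    return item["utility"] > 0

def selective_retention(kb):
    # Filter 4: forget items whose cost outweighs their benefit.
    return [i for i in kb if i["utility"] > 0]

def selective_utilization(kb, goal_features):
    # Filter 5: at problem-solving time, use only relevant items.
    return [i for i in kb if set(i["features"]) & set(goal_features)]

kb = []
for exp in selective_experience(experiences):
    attended = selective_attention(exp)
    if selective_acquisition(attended):
        kb.append(attended)
kb = selective_retention(kb)
used = selective_utilization(kb, goal_features=["a", "d"])
print([i["features"] for i in used])  # -> [['a', 'b'], ['d']]
```

Each filter discards information before it reaches the next stage, which is how a selective learner avoids paying storage and matching costs for low-utility knowledge.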

DIDO is a learning system that builds an egocentric model of its environment through experimentation. The paper Learning Novel Domains Through Curiosity and Conjecture by Scott and Markovitch gives a short account of the system. The system uses uncertainty-based selective experience to direct its learning. This methodology is described in the paper Experience Selection and Problem Choice in an Exploratory Learning System. The system builds conjectures readily, but uses a selective retention filter to retract those that prove redundant or incorrect. This aspect of DIDO is described in the paper Representation Generation in an Exploratory Learning System.

LASSY is a learning system that acquires domain-specific knowledge to accelerate a PROLOG interpreter. A complete description of the system can be found in my thesis. LASSY uses both inductive and deductive methods for learning. The deductive part was used to study the information filtering framework; it demonstrates the tradeoff between acquisition filters, retention filters, and utilization filters. [This part is described here] The utilization filtering method is described in the paper Utilization Filtering: a method for reducing the inherent harmfulness of deductively learned knowledge.
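The uncertainty-based selective experience used by DIDO can be sketched in a few lines. This is an illustrative simplification, not DIDO's actual mechanism: the learner prefers the candidate experiment whose predicted outcome distribution has the highest entropy, i.e. where its current model is least certain.

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Hypothetical predicted outcome distributions for candidate experiments.
candidates = {
    "exp_A": [0.99, 0.01],  # model is nearly certain of the outcome
    "exp_B": [0.5, 0.5],    # model is maximally uncertain
    "exp_C": [0.8, 0.2],
}

def select_experience(cands):
    # Selective experience: pursue the most informative experiment.
    return max(cands, key=lambda name: entropy(cands[name]))

print(select_experience(candidates))  # -> exp_B
```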

Recently we have resumed our research on accelerating logic inference in LASSY's framework. The paper The Divide-and-Conquer Subgoal-Ordering Algorithm for Speeding up Logic Inference by Ledeniov and Markovitch describes a new algorithm that finds a lowest-cost ordering by exploiting the independencies between subgoals. The utility problem arises when the cost of ordering outweighs the benefit of the improved sequence. The paper Learning Investment Functions for Controlling the Utility of Control Knowledge (an expanded version is also available) describes a method for overcoming this problem. The method involves converting the ordering procedure into an *anytime algorithm*, learning *resource investment functions*, and using them to control the resources spent by the ordering procedure.
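The control idea can be sketched as follows. Everything here is illustrative (the investment function, the cost constant, and the stopping policy are made up): an anytime ordering procedure keeps investing effort only while the learned function predicts that one more unit of ordering effort saves more execution cost than it spends.

```python
def expected_saving(effort):
    """Hypothetical learned resource-investment function:
    expected reduction in execution cost from one more unit of
    ordering effort, with diminishing returns."""
    return 100 / (effort + 1)

ORDERING_COST_PER_STEP = 10  # illustrative cost of one unit of ordering

def invest(max_effort=50):
    """Anytime control loop: stop when the marginal expected saving
    no longer exceeds the marginal ordering cost."""
    effort = 0
    while effort < max_effort and expected_saving(effort) > ORDERING_COST_PER_STEP:
        effort += 1
    return effort

print(invest())  # -> 9
```

The point of the anytime formulation is exactly this kind of policy: the ordering procedure can be interrupted at any effort level, so a learned function can decide how much ordering is worth buying for a given query.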

Another work on utilization filtering in the context of macro-learning is described in the paper Utilization Filtering of Macros Based on Goal Similarity by Keidar, Markovitch and Webman.

The paper The Role of Forgetting in Learning studies the problem of selective retention in the context of macro learning.

Micro-Hillary is a selective macro-learning algorithm that uses an attention filter which considers only subpaths connecting local minima to states with better heuristic values. Micro-Hillary is able to solve sliding-tile puzzles of size 50x50 after training on small self-generated problems. The paper A Selective Macro-learning Algorithm and its Application to the NxN Sliding-Tile Puzzle by Markovitch and Finkelstein describes the algorithm.
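The attention filter can be sketched as follows; this is an illustrative simplification of the idea, not Micro-Hillary's code. A subpath qualifies as a macro candidate only if it starts at a local minimum of the heuristic and ends at the first state with a strictly better heuristic value.

```python
def escape_macros(path, h):
    """Return subpaths of `path` that escape local minima.
    path: list of states; h: heuristic function (lower is better)."""
    macros = []
    i = 0
    while i < len(path):
        # Local minimum: the next step does not improve the heuristic.
        if i + 1 < len(path) and h(path[i + 1]) >= h(path[i]):
            j = i + 1
            while j < len(path) and h(path[j]) >= h(path[i]):
                j += 1
            if j < len(path):  # reached a state better than the minimum
                macros.append(path[i:j + 1])
            i = j
        else:
            i += 1
    return macros

# Toy example: states are numbers and the heuristic is the state itself.
# The solver is stuck at 4 (both 6 and 7 are worse) until it reaches 3.
path = [5, 4, 6, 7, 3, 2]
print(escape_macros(path, h=lambda s: s))  # -> [[4, 6, 7, 3]]
```

Ordinary downhill steps (5 to 4, 3 to 2) are ignored; only the escape from the local minimum is worth storing, which keeps the macro set small and high-utility.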

The goal of the LASSY system is to accelerate a PROLOG interpreter using acquired domain-specific knowledge. LASSY uses two learning mechanisms: deductive lemma learning and inductive learning for subgoal reordering. The paper Automatic Ordering of Subgoals - a Machine Learning Approach describes the algorithms used for reordering subgoals. The lemma-learning mechanism is described elsewhere. A new algorithm for subgoal ordering is presented in the paper The Divide-and-Conquer Subgoal-Ordering Algorithm for Speeding up Logic Inference by Oleg Ledeniov and Shaul Markovitch. The Divide-and-Conquer algorithm exploits the independencies between subgoals to accelerate the ordering process. The method we use to acquire control knowledge for the algorithm is described in the paper Learning Investment Functions for Controlling the Utility of Control Knowledge.
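The divide-and-conquer idea can be sketched as follows. This is an illustrative toy, not the paper's actual algorithm or cost model: subgoals that share no variables are independent, so each independent group can be ordered separately and the orderings concatenated, instead of searching over permutations of the full subgoal list.

```python
# Subgoal name -> (variables it uses, estimated execution cost).
subgoals = {
    "p": ({"X"}, 5),
    "q": ({"X"}, 2),
    "r": ({"Y"}, 7),
    "s": ({"Y"}, 1),
}

def independent_groups(sgs):
    """Partition subgoals into groups that share variables."""
    groups = []
    for name, (vars_, _) in sgs.items():
        for g in groups:
            if any(vars_ & sgs[m][0] for m in g):
                g.append(name)
                break
        else:
            groups.append([name])
    return groups

def order_group(group, sgs):
    # Toy heuristic within a group: cheaper subgoals first.
    return sorted(group, key=lambda n: sgs[n][1])

def order_all(sgs):
    ordered = []
    for g in independent_groups(sgs):
        ordered.extend(order_group(g, sgs))
    return ordered

print(order_all(subgoals))  # -> ['q', 'p', 's', 'r']
```

Ordering two groups of two costs far less than ordering one group of four, and the gap grows factorially with the number of subgoals; that is where the acceleration comes from.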

One of the most common methods used for speedup learning is the acquisition of macro-operators. Micro-Hillary, a simple selective macro-learning algorithm, is a very successful speedup learner that is able to solve sliding-tile puzzles of any size. In the paper Systematic experimentation with deductive learning: Satisficing vs. optimizing search by Markovitch and Rosdeutscher we study macro-learning in grid domains.

When multiple agents act in the same environment, an agent can benefit by learning models of the other agents involved. The paper Learning models of intelligent agents, by Carmel and Markovitch, views interaction as a *repeated game* (in the game-theoretic sense). Assuming that the other agent uses a regular strategy, modeling reduces to the problem of inducing finite automata from observations [A longer version of the paper can be found here]. When an adaptive agent interacts with another agent, it faces two contradicting goals: choosing actions that maximize its expected utility, and choosing actions that reveal as much as possible about the other agent's strategy. The paper Exploration and Adaptation in Multiagent Systems: A Model-based Approach describes methods for coping with this difficulty. In the paper Exploration Strategies for Model-based Learning in Multiagent Systems we describe a novel lookahead strategy that takes into account the risk involved in exploration.
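To make the automaton view concrete, here is an illustrative sketch (the automaton is hand-written, not induced, and all names are assumptions): Tit-for-Tat in the repeated Prisoner's Dilemma is a regular strategy, so it can be represented as a finite automaton, and an agent holding this model can evaluate its own move sequences by simulating the model forward.

```python
# DFA modeling Tit-for-Tat: the state is the opponent's next move,
# and the transition is driven by my move this round.
TIT_FOR_TAT = {
    ("C", "C"): "C", ("C", "D"): "D",
    ("D", "C"): "C", ("D", "D"): "D",
}

# Standard Prisoner's Dilemma payoffs for (my move, opponent's move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(my_moves, start_state="C"):
    """Simulate the modeled opponent against a fixed move sequence
    and return my total payoff."""
    state, total = start_state, 0
    for m in my_moves:
        total += PAYOFF[(m, state)]
        state = TIT_FOR_TAT[(state, m)]
    return total

# Against the modeled Tit-for-Tat, sustained cooperation beats defection:
print(play("CCCCC"), play("DDDDD"))  # -> 15 9
```

A model like this is what makes exploitation possible; the exploration problem is that the agent must sometimes deviate from the apparently best response to confirm that the induced automaton is actually correct.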

The paper Incorporating Opponent models into adversary search, by Carmel and Markovitch, deals with interaction in zero-sum games. It describes a generalization of minimax that enables the exploitation of recursive opponent models in adversary search. A longer version of the paper also deals with uncertain models. The paper Learning and Using Opponent Models describes a method for learning opponent models. The paper Pruning Algorithms for Multi-model Adversary Search presents theoretical and experimental results on pruning in multi-model search.
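The core difference from plain minimax can be shown on a tiny hand-built tree (the tree, evaluations, and one-ply depth are all invented for illustration, and this is a simplification of the multi-model idea): at the opponent's nodes we predict the move using the *opponent's own* evaluation function, then score the resulting position with *our* evaluation.

```python
# Two-ply game tree: root (my move) -> opponent's move -> leaf.
# Leaves carry a pair (my_eval, opponent_eval).
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"], "b": ["b1", "b2"],
    "a1": (4, 1), "a2": (9, 5), "b1": (6, 2), "b2": (2, 8),
}

def opponent_choice(node):
    """Opponent model: picks the child maximizing its OWN evaluation,
    not the one minimizing mine (as plain minimax would assume)."""
    return max(tree[node], key=lambda c: tree[c][1])

def my_value(node):
    """Value of a root move = my evaluation of the leaf the modeled
    opponent is predicted to reach."""
    return tree[opponent_choice(node)][0]

best = max(tree["root"], key=my_value)
print(best, my_value(best))  # -> a 9
```

Here the modeled opponent at node "a" prefers leaf a2 (its value 5), which happens to be excellent for us (our value 9), so the model-based player values move "a" at 9 while plain minimax would pessimistically value it at only 4.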

The paper Learning of Resource Allocation Strategies for Game Playing, by Markovitch and Sella, studies the problem of distributing resources over a sequence of actions. Given an unknown game, the agent learns classification rules that distinguish between "easy" and "difficult" states. [A longer version of the paper] The paper Derivative Evaluation Function Learning Using Genetic Operators, by Lorenz and Markovitch, describes a method for applying genetic algorithms to evaluation-function learning.
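The resource-allocation idea can be sketched as follows; the classification rule, weights, and states below are illustrative, not the paper's learned strategy. A classifier labels states "easy" or "difficult", and the fixed time budget is divided in proportion to the labels.

```python
def classify(state):
    # Hypothetical learned rule: states with many legal moves are difficult.
    return "difficult" if state["legal_moves"] > 20 else "easy"

def allocate(states, budget):
    """Split a fixed time budget across states, spending more on
    states classified as difficult (illustrative 3:1 weighting)."""
    weight = {"easy": 1, "difficult": 3}
    labels = [classify(s) for s in states]
    total = sum(weight[l] for l in labels)
    return [budget * weight[l] / total for l in labels]

states = [{"legal_moves": 5}, {"legal_moves": 30}, {"legal_moves": 8}]
print(allocate(states, budget=10))  # -> [2.0, 6.0, 2.0]
```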