|
In the paper "On Prediction Using Variable Order Markov Models"
we studied and compared the performance of various prediction algorithms. Among
them are Context Tree Weighting (CTW), Prediction by Partial Match (PPM),
Probabilistic Suffix Trees (PST) and Lempel-Ziv (LZ78).
Ron Begleiter coded all these algorithms in Java and the code can be
downloaded
here.
In the paper "Online Choice of Active Learning Algorithms"
we propose a new meta-algorithm for active learning: operate a small ensemble of
active learners and switch between them online. Kobi Luz coded our algorithm as
well as other SVM-based active learners including algorithm "Simple" of Tong and
Koller and an algorithm by Roy and McCallum. The Java code can be downloaded
here.
An improved version of this Java code, as well as a Matlab wrapper, was coded by
Ron Begleiter.
This recent implementation has two main components: Experimenter and Learner.
The Experimenter outputs a learning curve graph (for the given algorithm) based
on k-fold cross-validation. The Learner implements a standard active learner
interface ("learn", "query" and "classify"). The base code is written for Java 1.4.*.
We also provide a Matlab wrapper for the Learner component. All relevant
parameters are fully configurable via a textual configuration file. Press
here to get this code
as well as documentation.
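
To give a feel for what a "learn" / "query" / "classify" interface can look like, here is a minimal sketch. It is not taken from the distributed package: the names `ActiveLearner`, `ThresholdLearner` and `run`, the method signatures, and the toy one-dimensional data are all assumptions made for illustration. It also uses generics, so it needs a newer Java than the 1.4.* targeted by the base code. The toy learner queries the pool point closest to its current decision threshold, which is only loosely in the spirit of margin-based querying and is not an SVM.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; every name here is hypothetical.
class ActiveLearnerSketch {

    /** Hypothetical interface with the three operations named in the text. */
    interface ActiveLearner {
        void learn(double[] x, int label);        // add one labeled example (+1 or -1)
        int query(List<double[]> unlabeledPool);  // index of the pool point to label next
        int classify(double[] x);                 // predicted label (+1 or -1)
    }

    /** Toy 1-D learner: threshold midway between the largest negative and smallest positive. */
    static class ThresholdLearner implements ActiveLearner {
        private double maxNegative = Double.NEGATIVE_INFINITY;
        private double minPositive = Double.POSITIVE_INFINITY;

        public void learn(double[] x, int label) {
            if (label > 0) minPositive = Math.min(minPositive, x[0]);
            else maxNegative = Math.max(maxNegative, x[0]);
        }

        double threshold() {
            if (Double.isInfinite(maxNegative) && Double.isInfinite(minPositive)) return 0.0;
            if (Double.isInfinite(maxNegative)) return minPositive;
            if (Double.isInfinite(minPositive)) return maxNegative;
            return (maxNegative + minPositive) / 2.0;
        }

        public int query(List<double[]> unlabeledPool) {
            // Pick the unlabeled point closest to the current threshold.
            int best = -1;
            double bestDistance = Double.POSITIVE_INFINITY;
            double t = threshold();
            for (int i = 0; i < unlabeledPool.size(); i++) {
                double d = Math.abs(unlabeledPool.get(i)[0] - t);
                if (d < bestDistance) { bestDistance = d; best = i; }
            }
            return best;
        }

        public int classify(double[] x) {
            return x[0] > threshold() ? 1 : -1;
        }
    }

    /** Runs a small query loop on the grid 0.1 .. 0.9 against a simulated labeling oracle. */
    static ThresholdLearner run(double trueThreshold, int budget) {
        List<double[]> pool = new ArrayList<>();
        for (int i = 1; i < 10; i++) pool.add(new double[] { i / 10.0 });

        ThresholdLearner learner = new ThresholdLearner();
        learner.learn(new double[] { 0.0 }, -1);  // seed with one example per class
        learner.learn(new double[] { 1.0 }, 1);

        for (int q = 0; q < budget && !pool.isEmpty(); q++) {
            double[] x = pool.remove(learner.query(pool));
            int label = x[0] > trueThreshold ? 1 : -1;  // the oracle answers the query
            learner.learn(x, label);
        }
        return learner;
    }

    public static void main(String[] args) {
        ThresholdLearner learner = run(0.37, 4);
        System.out.println("learned threshold: " + learner.threshold());
    }
}
```

An experimenter in this style would call `run` once per cross-validation fold and record test accuracy after each query to build a learning curve; that part is omitted here.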
 |
In the paper "Multi-Way Distributional Clustering via Pairwise Interactions"
we propose a new clustering algorithm utilizing multiple
feature dimensions or modalities at once. This idea is implemented and made
efficient using a factored representation as used in graphical models and by
applying both top-down and bottom-up clustering. We report results on email
clustering, and new best clustering results on 20 Newsgroups.
Ron Bekkerman's
C++ implementation of the algorithm can be accessed from
here.
In the paper "Localized Boosting"
we propose a new type of classifier boosting strategy where each weak learner
(or "expert") is explicitly restricted to be a specialist only in a certain
vicinity of the data space.
Gilad Mishne programmed a nice applet demonstrating this algorithm as well as
many other classifiers. The applet is based on the Weka Data Mining Package.
The code of our applet can be downloaded here.
Note: The code provided on this page is free software; you
can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version. This code is distributed in the
hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License (GPL)
for more details.
|