TiMBL is an open source software package implementing several memory-based learning algorithms, including IB1-IG, an implementation of k-nearest neighbor classification with feature weighting suitable for symbolic feature spaces, and IGTree, a decision-tree approximation of IB1-IG. All implemented algorithms store some representation of the training set explicitly in memory. During testing, new cases are classified by extrapolation from the most similar stored cases.
For over fifteen years TiMBL has been mostly used in natural language processing as a machine learning classifier component, but its use extends to virtually any supervised machine learning domain. Due to its particular decision-tree-based implementation, TiMBL is in many cases far more efficient in classification than a standard k-nearest neighbor algorithm would be.
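To make the idea behind IB1-IG concrete, the following is a minimal Python sketch of k-nearest neighbor classification over symbolic features, using an overlap metric weighted by information gain. It is an illustration only, not TiMBL's implementation: the function names, the toy data, and the simple handling of k are assumptions, and none of TiMBL's tree-based optimizations are reproduced.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(instances, labels):
    """Return one information-gain weight per feature position."""
    base = entropy(labels)
    weights = []
    for f in range(len(instances[0])):
        # Group the class labels by the value this feature takes.
        by_value = defaultdict(list)
        for inst, lab in zip(instances, labels):
            by_value[inst[f]].append(lab)
        remainder = sum(len(subset) / len(labels) * entropy(subset)
                        for subset in by_value.values())
        weights.append(base - remainder)
    return weights


def classify(query, instances, labels, weights, k=1):
    """Majority vote among the k stored instances closest to the query.

    Distance is the weighted overlap: the sum of the weights of all
    features whose symbolic values do not match.
    """
    scored = sorted(
        (sum(w for w, a, b in zip(weights, query, inst) if a != b), lab)
        for inst, lab in zip(instances, labels)
    )
    votes = Counter(lab for _, lab in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (shape, color) -> fruit
train = [("round", "red"), ("round", "green"),
         ("long", "yellow"), ("long", "green")]
labels = ["apple", "apple", "banana", "banana"]

weights = information_gain(train, labels)
prediction = classify(("round", "yellow"), train, labels, weights, k=1)
```

In this toy set the shape feature predicts the class perfectly and so receives the larger weight. With an unweighted overlap metric the query would be equally distant from three of the four stored instances; with information-gain weights the two "round" instances are closest.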
Features
- Fast, decision-tree-based implementation of k-nearest neighbor classification
- Implementations of IB1 and IB2, IGTree, TRIBL, and TRIBL2 algorithms
- Similarity metrics: Overlap, MVDM, Jensen-Shannon and Jeffrey Divergence, Dot product, Cosine
- Feature weighting metrics: information gain, gain ratio, chi squared, shared variance
- Per-value similarity metrics: Levenshtein, Dice coefficient
- Distance weighting metrics: inverse, inverse linear, exponential decay
- Multi-CPU support
- Extensive verbosity options to inspect nearest neighbor sets
- Server functionality and extensive API
- Fast leave-one-out testing and internal cross-validation
- Handles user-defined example weighting
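As an illustration of the distance weighting options listed above, here is a small Python sketch of how neighbor votes can be weighted by distance. The formulas are common textbook forms of inverse distance, inverse linear, and exponential decay weighting; the parameter names (eps, alpha, beta) and the helper weighted_vote are assumptions for this example, and the exact definitions used by TiMBL are given in its Reference Guide.

```python
import math


def inverse_distance(d, eps=1e-6):
    """Weight falls off as 1/d; eps (assumed) avoids division by zero."""
    return 1.0 / (d + eps)


def inverse_linear(d, d_nearest, d_farthest):
    """Linear scale: nearest neighbor gets 1.0, farthest gets 0.0."""
    if d_farthest == d_nearest:
        return 1.0
    return (d_farthest - d) / (d_farthest - d_nearest)


def exponential_decay(d, alpha=1.0, beta=1.0):
    """Weight decays exponentially with distance."""
    return math.exp(-alpha * d ** beta)


def weighted_vote(neighbors, weight_fn):
    """Sum the weights per class label and return the heaviest label.

    neighbors is a list of (distance, label) pairs.
    """
    totals = {}
    for d, label in neighbors:
        totals[label] = totals.get(label, 0.0) + weight_fn(d)
    return max(totals, key=totals.get)


# Hypothetical neighbor set: one close "apple", two distant "banana"s.
neighbors = [(0.5, "apple"), (2.0, "banana"), (2.5, "banana")]

majority = weighted_vote(neighbors, lambda d: 1.0)
by_inverse = weighted_vote(neighbors, inverse_distance)
by_linear = weighted_vote(neighbors, lambda d: inverse_linear(d, 0.5, 2.5))
by_decay = weighted_vote(neighbors, exponential_decay)
```

With plain majority voting the two distant neighbors win; under each of the three weighting schemes the single close neighbor outweighs them.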
Download & Installation
Timbl is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation.
To download and install Timbl:
- First check whether your distribution's package manager includes an up-to-date package. There are packages for Alpine Linux, Arch Linux (AUR), macOS (homebrew), Debian and derivatives like Ubuntu.
- If not, we recommend you use our docker container via docker pull proycon/timbl. It includes Timbl and all necessary dependencies.
- Alternatively, you can always download, compile and install Timbl manually, as shown next.
Manual installation
To compile Timbl manually, consult the included INSTALL document. You will need current versions of the following dependencies of our software:
- ticcutils - A shared utility library
You will also need the following third-party dependencies:
- A sane build environment with a C++ compiler (e.g. gcc or clang), autotools, libtool, pkg-config
Documentation
- Book: Memory-Based Language Processing - Daelemans, W., and Van den Bosch, A. (2005). Cambridge, UK: Cambridge University Press.
- Reference Guide; Daelemans, W., Zavrel, J., Van der Sloot, K., and Van den Bosch, A. (still in edit). TiMBL: Tilburg Memory Based Learner, version 6.4, Reference Guide.
- API guide (34 pages, 129 kB PDF); Van der Sloot, K. (2010). TiMBL: Tilburg Memory Based Learner, version 6.3, API Guide. ILK Research Group Technical Report Series no. 10-03.
- TimblServer Manual (12 pages, 62 kB PDF); Van der Sloot, K. (2010). TimblServer: Tilburg Memory-Based Learner Server, version 1.0, Manual. ILK Research Group Technical Report Series no. 10-02.
Extensions
Several wrappers, bindings and other extensions to TiMBL have been developed:
- TimblServer - TiMBL wrapper, adds server functionality to TiMBL
- python-timbl - Python language binding for TiMBL
- Dimbl - Parallel TiMBL for multi-core processing; parallelizes by splitting the training set
- paramsearch - Automatic hyperparameter optimization for TiMBL (and other ML algorithms)
- rtimbl - a Ruby interface to TiMBL
- Timpute - TiMBL-based data imputation
- knngraph - Visualizes nearest neighbors in a TiMBL instance base
TiMBL is a core component of various NLP software systems such as MBT (memory-based tagger generator), Frog (Dutch morpho-syntactic analyzer), Gecco (context-sensitive spelling corrector, used by Valkuil.net for Dutch and Fowlt.net for English), and SoothSayer (Dutch word completion).
The development and improvement of TiMBL also rely on your bug reports, suggestions, and comments. Use the GitHub issue tracker or mail lamasoftware (at) science.ru.nl.