The OpenDT project was created in 2003 by Robert E. Banfield and is one of the very few free software packages for
machine learning and data mining applications that may be modified and redistributed. As the name suggests, the
current focus is on decision trees, and more specifically on ensembles of decision trees. Machine learning has
benefited tremendously from multiple classifier systems, in both speed and classification accuracy. An ensemble is
one type of multiple classifier system.
Some features include:
- Bagging
- Boosting
- Random Attributes
- Random Forests
- Random Subspaces
- Missing Attributes
- C-Style Object-oriented code
- Fully distributed implementation which is transparent to the end-user
- Extremely fast classification time even for standard decision trees (5-10x faster than C4.5, with nearly identical results)
- and much more...
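Bagging, listed among the features above, can be illustrated with a minimal Python sketch. This is not OpenDT's code or API: the function names (train_stump, train_bagged, predict_bagged), the one-feature toy data, and the use of single-split "stumps" in place of full decision trees are all assumptions made for illustration. Each tree is trained on a bootstrap sample (drawn with replacement) and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def bootstrap_sample(rows, rng):
    """Draw len(rows) examples with replacement."""
    return [rng.choice(rows) for _ in rows]


def train_stump(rows):
    """Fit a one-split tree on (x, label) pairs with a single numeric x.

    Tries the midpoint between each pair of adjacent distinct x values and
    keeps the threshold with the fewest training errors. A threshold of
    None means "always predict the majority label".
    """
    xs = sorted({x for x, _ in rows})
    fallback = majority([y for _, y in rows])
    best = (None, fallback, fallback)
    best_errors = sum(y != fallback for _, y in rows)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = majority([y for x, y in rows if x <= t])
        right = majority([y for x, y in rows if x > t])
        errors = sum((left if x <= t else right) != y for x, y in rows)
        if errors < best_errors:
            best, best_errors = (t, left, right), errors
    return best


def predict_stump(stump, x):
    """Classify x with a stump returned by train_stump."""
    t, left, right = stump
    if t is None:
        return left
    return left if x <= t else right


def train_bagged(rows, n_trees, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict_bagged(ensemble, x):
    """Classify x by majority vote over all stumps in the ensemble."""
    return majority([predict_stump(s, x) for s in ensemble])


if __name__ == "__main__":
    # Hypothetical toy data: one numeric attribute, two classes.
    data = [(1.0, "a"), (2.0, "a"), (3.0, "a"),
            (7.0, "b"), (8.0, "b"), (9.0, "b")]
    ensemble = train_bagged(data, n_trees=25, seed=0)
    for x in (1.5, 8.5):
        print(x, predict_bagged(ensemble, x))
```

The other ensemble methods in the list differ mainly in how each tree's training view is built (for example, random subspaces sample attributes rather than examples); the voting step stays the same.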
For a complete list of program details, consult the usage notes available within the program (/Doc).
To compile, enter the /Src directory and type 'gmake'.
To compile for distributed systems, enter the /Src directory and type 'gmake mpi'.
To join the project or to report bugs, please send me an email at rbanfiel@csee.usf.edu.