Data sets

The data sets below can be used with some of the exercises in the book.

Training sets

Missing data

Hidden variable

Structural constraint

The KDD Cups and the UCI machine learning repository are sources of other data sets frequently used in the machine learning and data mining literature.

KDD Cup data

UCI data repository


Last modified: Tue Jun 19 20:13:08 2007