Tutorial Title: Data Mining

The amount of information in the world doubles about every 20 months. Among the fastest growing sources of data are the internet, industrial process supervision systems, business databases, biotechnology, and automatic imaging. Beyond data acquisition and storage, the biggest challenges are data processing and exploitation. The goal is to extract the relevant information (the "knowledge") from large data sets. This information extraction process uses conventional (linear) statistical methods such as correlation and regression, as well as methods from cluster analysis, neural computation, and machine learning.
Taken together, these data analysis methods are called "data mining". Data mining is one part of the knowledge discovery process, which also covers preprocessing, filtering, visualization, transformation, and feature generation.
The tutorial introduces some of the most important methods for data mining and knowledge discovery and presents some real-world application examples.
It is structured as follows:

1. introduction and definitions
2. data sources, characteristics, and distortions
3. preprocessing and filtering
4. visualization (projections, principal component analysis, multidimensional scaling, self-organizing maps)
5. data transformation and feature generation
6. data analysis (correlation, spurious correlation, regression, classification, decision trees, ID3, sequential clustering, c-means and its relatives, cluster estimation, radial basis functions, vector quantization)
7. application examples

Hard copies of the tutorial material will be provided.