Part 4 addresses visual and spatial data mining and consists of Chapters 11-16. Chapter 11 introduces two dynamic visualization techniques that use multi-dimensional scaling to analyze transient data streams such as newswires and remote sensing imagery. The first is an adaptive technique based on data stratification, which adjusts how much stream information is ingested when the influx rate exceeds the processing rate. The second is an incremental technique based on data fusion, which projects new information directly onto a visualization subspace spanned by the singular vectors of the previously processed neighboring data. The ultimate goal is to leverage the value of both legacy and new information while minimizing re-processing of the entire dataset in full resolution.
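The incremental idea summarized for Chapter 11 can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes the data are plain numeric vectors, uses a centered SVD as a stand-in for the scaling step, and all names (`fit_subspace`, `project`, `legacy`, `incoming`) are hypothetical. New vectors are placed in the existing 2-D view by projecting them onto the singular vectors computed from the earlier data, so the decomposition is not recomputed.

```python
import numpy as np

def fit_subspace(X, k=2):
    """Return the column mean and the top-k right singular vectors of X."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def project(X_new, mean, basis):
    """Project rows of X_new onto the previously computed subspace."""
    return (X_new - mean) @ basis.T

# Synthetic stand-ins for already-processed and newly arrived vectors.
rng = np.random.default_rng(0)
legacy = rng.normal(size=(200, 10))
incoming = rng.normal(size=(5, 10))

mean, basis = fit_subspace(legacy, k=2)
legacy_xy = project(legacy, mean, basis)      # coordinates of the old view
incoming_xy = project(incoming, mean, basis)  # new points, no new SVD
```

In this sketch the cost of placing a new item is one matrix-vector product. How and when the subspace itself should be refreshed as the stream drifts is left open here.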
Chapter 12 describes SPIN!, a spatial data mining platform whose main objective is to provide an open, highly extensible, n-tier system architecture based on the Java 2 Platform, Enterprise Edition (J2EE). The data mining functionality is distributed among (i) a Java client application for visualization and workspace management, (ii) an application server with an Enterprise JavaBeans (EJB) container for running data mining algorithms and workspace management, and (iii) a spatial database for storing data and executing spatial queries. In the SPIN! system, visual problem solving involves displaying data mining results, using visual data analysis tools, and finally producing a solution based on linked interactive displays that visualize various types of knowledge and data in different ways.
Chapter 13 begins by looking at the Predictive Model Markup Language (PMML), an XML-based industry standard for the platform- and system-independent representation of data mining models. It then presents VizWiz, a tool for the visualization and evaluation of data mining models that are specified in PMML. This tool allows for the highly interactive visual exploration of a variety of data mining result types, such as decision trees, classification and association rules, and subgroups.
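To give a feel for what an XML-based model representation looks like, the following sketch reads a hand-written decision tree fragment with Python's standard XML parser and lists one rule per leaf. The fragment is a simplified illustration modeled on PMML's `TreeModel`, `Node`, and `SimplePredicate` elements; it omits the namespace, data dictionary, and mining schema, so it is not a schema-valid PMML document. The field name `age` and the helper `leaf_rules` are assumptions made for the example and have no connection to VizWiz.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment in the style of a PMML TreeModel.
PMML_FRAGMENT = """
<PMML version="4.4">
  <TreeModel functionName="classification">
    <Node score="no">
      <True/>
      <Node score="yes">
        <SimplePredicate field="age" operator="lessThan" value="30"/>
      </Node>
      <Node score="no">
        <SimplePredicate field="age" operator="greaterOrEqual" value="30"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
"""

def leaf_rules(node, conditions=()):
    """Return (condition, score) pairs, one per leaf below `node`."""
    pred = node.find("SimplePredicate")
    if pred is not None:
        text = f'{pred.get("field")} {pred.get("operator")} {pred.get("value")}'
        conditions = conditions + (text,)
    children = node.findall("Node")
    if not children:
        return [(" AND ".join(conditions) or "TRUE", node.get("score"))]
    rules = []
    for child in children:
        rules.extend(leaf_rules(child, conditions))
    return rules

root = ET.fromstring(PMML_FRAGMENT)
rules = leaf_rules(root.find("TreeModel/Node"))
for condition, score in rules:
    print(f"IF {condition} THEN {score}")
```

Because the model is plain XML, any tool that can parse it can inspect or display the tree without access to the system that trained it.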
Chapter 14 describes new neural-network techniques developed for the visual mining of clinical electroencephalograms (EEGs). These techniques build on ideas from the Group Method of Data Handling (GMDH). The chapter briefly describes the standard neural-network techniques that are able to learn well-suited classification models from data represented by relevant features. It then introduces and applies an evolving cascade neural network technique that adds new input nodes as well as new neurons to the network as long as the training error decreases. The chapter also presents GMDH-type polynomial networks trained from data. Finally, it describes and applies new neural-network techniques developed to derive multi-class concepts from data.
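As a rough illustration of the GMDH idea mentioned above, the sketch below builds one layer of candidate polynomial neurons, each a quadratic function of a pair of inputs fitted by least squares, and ranks them by error on held-out data. This is a generic textbook-style sketch under stated assumptions, not the networks developed in Chapter 14: the synthetic data, the train/validation split, and the names `design` and `fit_layer` are all hypothetical.

```python
import itertools
import numpy as np

def design(a, b):
    """Quadratic terms of two inputs: 1, a, b, a*b, a^2, b^2."""
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def fit_layer(X_train, y_train, X_val, y_val):
    """Fit one neuron per input pair; rank neurons by validation MSE."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        A = design(X_train[:, i], X_train[:, j])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        pred = design(X_val[:, i], X_val[:, j]) @ coef
        val_mse = float(np.mean((pred - y_val) ** 2))
        candidates.append((val_mse, (i, j), coef))
    # Sort on the error only, so coefficient arrays are never compared.
    return sorted(candidates, key=lambda c: c[0])

# Synthetic data: the target depends only on inputs 0 and 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = X[:, 0] * X[:, 1]

ranked = fit_layer(X[:80], y[:80], X[80:], y[80:])
best_mse, best_pair, _ = ranked[0]
print(best_pair, best_mse)
```

Selecting neurons by error on data not used for fitting is what keeps the network from growing without bound. In a multi-layer version, the outputs of the best neurons would become the inputs of the next layer.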
Chapter 15 discusses how to represent scientific visualization and data mining tasks in a simpler form so that visual solutions become possible. Visualization is used in data mining both for the visual presentation of already discovered patterns and for discovering new patterns visually. Success in both tasks depends on the ability to present abstract patterns as simple visual patterns. A new approach called inverse visualization (IV) is suggested for addressing the problem of visualizing complex patterns. The approach relies on specially designed data preprocessing, which in turn rests on a transformation theorem proved in this chapter. A mathematical formalism is derived from the Representative Measurement Theory. The possibility of solving inverse visualization tasks is illustrated on functional non-linear additive dependencies.