Knowledge Extraction and Modeling
Department of Mathematics and Statistics
University of Naples - Federico II

The objective of the Workshop is to give an overview of the theme "Knowledge Extraction & Modeling", with up-to-date lectures showing the state of the art as well as the most recent advances and future challenges. The Workshop focuses on a theme that is not yet firmly established in the literature or in research.

Namely, the Workshop is meant to address the analysis of "complex systems", where the difficulty lies not only in the huge masses of data available but also in the complex structure of the relationships. In a sense, it is the problem of extracting information from models, not just from data.
Several statistical techniques for exploring a data structure are naturally interpretable in the context of the following operational model:

Data = Model + Error

where the sign "+" does not necessarily refer to an additive relation. This model reflects an exploratory context where, usually, a random part (the error) is combined with a structural one (the model). Once the data have been cleaned and either a class of models has been identified or a specific model has been estimated, the information on the models (e.g. parameter estimates, goodness-of-fit values, nature of the model, etc.) actually becomes the metadata from which further knowledge is extracted.
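As a minimal illustration of this operational model, the sketch below assumes the simplest possible case: an additive relation and a straight-line model fitted by least squares. The function name `fit_with_metadata`, the simulated data and the choice of metadata fields are assumptions made for the example, not part of the Workshop material.

```python
import random
from statistics import fmean


def fit_with_metadata(xs, ys):
    """Fit y = intercept + slope * x by least squares and return model metadata."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Error = Data - Model (additive case only, an assumption of this sketch)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((y - my) ** 2 for y in ys)
    return {
        "intercept": intercept,      # parameter estimate
        "slope": slope,              # parameter estimate
        "r2": 1 - sse / sst,         # goodness-of-fit value
        "residuals": residuals,      # the "Error" part
    }


# Simulated data: a structural part plus a random part (hypothetical values).
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.3) for x in xs]

meta = fit_with_metadata(xs, ys)
fitted = [meta["intercept"] + meta["slope"] * x for x in xs]  # the "Model" part
print({k: round(v, 3) for k, v in meta.items() if k != "residuals"})
```

The dictionary returned for each fitted model is the kind of record that could then be treated as a unit of analysis in its own right.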

The challenge consists in considering the interaction between Knowledge Extraction and Modeling by investigating two possible directions:

  1. Knowledge Extraction from models
    When a classification of the statistical units is known a priori, a model can be generated for each segment in the population and the significance of the difference between models can be assessed. A further step is to measure, ex post, the distance between models by exploiting the related results (parameter estimates, classification rules, segmentation rules, association rules, fit measures, etc.) as the set of metadata on which to graft the knowledge extraction process.
    This might be the case, for instance, when different structural equation models have been estimated for the satisfaction of thousands of consumer segments and a way to compare them is needed, or when a considerable number of time series models are available and need to be classified into a hierarchy (of models, not of observations!).

  2. Knowledge Extraction by modeling
    The population segmentation may be implied by identifying local models that are as distant from each other as possible while ensuring the best fit within each group. For instance, the structural relationships between variables observed on a population may emerge from a search for different regression models, thus determining a segmentation that was not known a priori.
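The two directions above can be sketched in a few lines, under strong simplifying assumptions: every model is a straight line, the distance between two models is taken to be the Euclidean distance between their parameter vectors, the hierarchy is built by single-linkage agglomeration, and the segmentation in direction 2 is found by a simple alternating procedure (fit one line per group, then reassign each unit to its best-fitting line). All function names and the simulated data are hypothetical, and other distances or algorithms could be used equally well.

```python
import random
from itertools import combinations
from statistics import fmean


def fit_line(xs, ys):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all x equal: fall back to a horizontal line
        return (my, 0.0)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (my - slope * mx, slope)


def model_distance(m1, m2):
    """Euclidean distance between parameter vectors (an assumed choice)."""
    return sum((a - b) ** 2 for a, b in zip(m1, m2)) ** 0.5


def cluster_models(models):
    """Direction 1: single-linkage hierarchy of models, not of observations.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples."""
    def linkage(pair):
        return min(model_distance(models[i], models[j])
                   for i in pair[0] for j in pair[1])

    clusters = [(name,) for name in models]
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        history.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return history


def squared_residual(model, x, y):
    return (y - model[0] - model[1] * x) ** 2


def clusterwise_regression(xs, ys, k=2, n_iter=20, seed=0):
    """Direction 2: alternate between fitting one line per group and moving
    each unit to the line with the smallest squared residual."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in xs]
    models = []
    for _ in range(n_iter):
        models = []
        for g in range(k):
            pts = [(x, y) for x, y, lab in zip(xs, ys, labels) if lab == g]
            if len(pts) < 2:  # degenerate group: refit on two random units
                pts = rng.sample(list(zip(xs, ys)), 2)
            models.append(fit_line(*zip(*pts)))
        new_labels = [min(range(k), key=lambda g: squared_residual(models[g], x, y))
                      for x, y in zip(xs, ys)]
        if new_labels == labels:  # assignments are stable: stop
            break
        labels = new_labels
    return labels, models


# Simulated segments (hypothetical values): A and B are similar, C differs.
rng = random.Random(1)
xs = [i / 2 for i in range(20)]
segments = {
    "A": [1.0 + 2.0 * x + rng.gauss(0, 0.2) for x in xs],
    "B": [1.2 + 2.1 * x + rng.gauss(0, 0.2) for x in xs],
    "C": [8.0 - 1.5 * x + rng.gauss(0, 0.2) for x in xs],
}

# Direction 1: segments known a priori, one model each, then a hierarchy.
models = {name: fit_line(xs, ys) for name, ys in segments.items()}
history = cluster_models(models)

# Direction 2: pool the units, hide the segments, let local models imply them.
pooled_x = xs * 3
pooled_y = segments["A"] + segments["B"] + segments["C"]
labels, local_models = clusterwise_regression(pooled_x, pooled_y, k=2)
```

An alternating procedure of this kind can stop at a local optimum, so in practice it would typically be restarted from several random initial segmentations and the best-fitting solution retained.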

International Association for Statistical Computing

The Interface Foundation of North America

International Federation of Classification Societies

Villa Orlandi - Island of Capri, Italy
September 4th-6th, 2006