**HSSS Research Kitchen on**

**Learning Conditional Independence Models**

**16 - 20 October 2000**

**Trest, Czech Republic**


Report to European Science Foundation

The following summarizes the discussions as they took place during the meeting. Each session was devoted to a specific topic. The number of speakers per session varied from one to four, but the number of discussants was higher.

**Monday 16 October 2000**
*Survey and problem classification* (MS)

- Various graphical and non-graphical approaches to the description of CI structures
- Soundness and completeness
- Equivalence problem, inclusion problem, representative choice problem
- Interpretation, learning and computational aspects

**Tuesday 17 October 2000**
*Learning strategies* (PG, EF, CT, GK)

*1. MCMC learning of graphical models* (PG)

- Interplay between computations and learning
- Local specifications and computations
- Analysis of decomposable UG models
- Multivariate Gaussian and contingency table models
- Hyper Markov prior distributions
- Simultaneous quantitative and structural learning by means of MCMC
- Convergence diagnostics for MCMC

*2. MCMC learning for DAG models* (EF)

- Choice of parametrization
- Representative graph model search versus full DAG space model search

*3. Analysis of graphical factor models* (CT)

- Definition of graphical factor models
- Importance of identifiability
- Matrix representation of models
- A priori rules to move between identifiable models

*4. Methodological aspects in learning* (GK)

- Precision attached to the Bayesian approach
- Specification of prior distributions over model space
- Essential graph representative
- Smoothing and shrinkage aspects
- Decomposition of problems

**Wednesday 18 October 2000**
*Inclusion problem* (RB, TK)

*1. Inclusion problem I.* (RB)

- Counterexamples to previous attempts
- Non-locality aspect

*2. Inclusion problem II.* (TK)

- Equivalence of DAGs
- Meek's conjecture and its proof in a special case
- Neighbourhood concept and covering
- Comparison of moves in graphical and CI neighbourhood

**Thursday 19 October 2000**
*Iterative methods and exponential families* (FM, TR)

*1. Exponential families* (FM)

- Closure of exponential families
- Intersection of graphical models
- Iterative procedures: open problems
- Interpretation using information geometry
- Limiting properties and accumulation points

*2. Parametrization of exponential families* (TR)

- Mixed parametrization of exponential families
- Specifications for categorical data
- Proof of convergence of an iterative procedure
- Marginal log-linear and log-affine models
- Smoothness and variation independence of parametrization
- Graphical DAG models and marginal models

**Thursday 19 October 2000: evening**
*Overview of discussion* (All participants)
The aim of this session was to summarize the discussion and to identify common research goals for possible future cooperation. The result was a list of shared research interests, given in the Appendix.

**Friday 20 October 2000**
*Open problem session* (RB, RJ, FM, PG)
Further open problems, beyond those mentioned earlier, were formulated. The participants agreed to give exact formulations of the open problems of common interest (mathematical formulations where possible). These problems will then be posted on the web page of the research kitchen within 2 or 3 months after the meeting.

**Follow-up**

Continuing research relationships between kitcheners are expected. Specific targets include joint publications on specific topics and on general methodology. An example of such joint work is a paper on a partial solution of the inclusion problem, whose writing started immediately after the kitchen. Further open questions motivated by the idea of learning chain graph models (e.g. representation of classes of equivalent chain graphs, neighbourhood characterization) are expected to be a topic of future cooperation.

**List of participants**

- **Remco Bouckaert**, Crystal Mountain Information Technology, *rrb@xm.co.nz*
- **Eva-Maria Fronk**, University of Munich, *fronk@stat.uni-muenchen.de*
- **Paolo Giudici**, University of Pavia, *giudici@unipv.it*
- **Radim Jiroušek**, University of Economics Prague, *radim@vse.cz*
- **Gernot Kleiter**, University of Salzburg, *gernot.kleiter@sbg.ac.at*
- **Tomáš Kočka**, University of Economics Prague, *kocka@vse.cz*
- **František Matúš**, Academy of Sciences of the Czech Republic, *matus@utia.cas.cz*
- **Tamás Rudas**, Eotvos University Budapest, *rudas@tarki.hu*
- **Milan Studený**, Academy of Sciences of the Czech Republic, *studeny@utia.cas.cz*
- **Claudia Tarantola**, University of Pavia, *ctarantola@eco.unipv.it*

*Remark.* The stays of Gernot Kleiter and Radim Jiroušek were covered from other sources.

**Appendix: common aspects and research goals**
(in alphabetical order)

- Applications in economics, social sciences and behavioural sciences, official statistics, web mining and information technology,
- Bayesian approach for simultaneous quantitative and qualitative learning,
- Computational algorithms, assessment of convergence and iterative procedures,
- Equivalence, representatives and topology of models,
- Intuition, precision and interpretation of models,
- Learning methods, exploration and model selection,
- Local specification, computation and inference,
- Moving in the model space,
- Parametrization: properties and interpretation of different parametrizations,
- Representation types of the models,
- Soundness and completeness problems.