# Data Science Seminars

## Interpretable Comparison of Generative Models

**Speaker**: Wittawat Jitkrittum (Research Scientist at Google Research)

**Additional information**

Given two generative models (e.g., two GAN models) and a set of target observations (e.g., real images), how do we know which model is better? In this talk, I will introduce recently developed kernel-based distance measures that help answer this question. These measures can be used to construct a nonparametric, computationally efficient statistical test that systematically measures the relative goodness of fit of the two candidate models. As a unique advantage, the test can produce a set of examples showing where one model fits significantly better than the other. No deep background in kernel methods or statistical testing is needed for this talk; all prerequisites will be introduced.
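For readers new to kernel-based distances, the following is a minimal sketch of the general idea, not the test presented in the talk. The abstract does not name the specific measures, so this example assumes the maximum mean discrepancy (MMD) with a Gaussian kernel as a stand-in. The helper names (`gaussian_kernel`, `mmd2_biased`, `sample_gaussian`), the bandwidth, and the toy one-dimensional "models" are all illustrative assumptions.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(mean, n, rng):
    """Draw n one-dimensional points from N(mean, 1); a toy stand-in for a model."""
    return [[rng.gauss(mean, 1.0)] for _ in range(n)]


rng = random.Random(0)
target = sample_gaussian(0.0, 200, rng)   # stand-in for real observations
model_p = sample_gaussian(0.1, 200, rng)  # hypothetical model close to the target
model_q = sample_gaussian(1.5, 200, rng)  # hypothetical model far from the target

mmd_p = mmd2_biased(model_p, target)
mmd_q = mmd2_biased(model_q, target)

# A smaller value means the model's samples look more like the target's.
print("model P is closer" if mmd_p < mmd_q else "model Q is closer")
```

Comparing two point estimates like this does not say whether the difference is statistically significant. Quantifying that, and showing where the models differ, is what a relative goodness-of-fit test of the kind described above adds.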

**Time and Place**: Thursday 5th November 2020 at 3.00pm on Zoom (contact Motonobu Kanagawa for the link)

**Slides**: Slides of the talk are available here.

### Previous talks

#### Explaining the Explainer: A First Theoretical Analysis of LIME

**Speaker**: Damien Garreau (Assistant Professor at Université Côte d’Azur)

**Abstract**

Machine learning is increasingly used for sensitive applications, sometimes replacing humans in critical decision-making processes. As such, interpretability of these algorithms is a pressing need. One popular algorithm for providing interpretability is LIME (Local Interpretable Model-agnostic Explanations). In this talk, I will present a first theoretical analysis of LIME. In particular, we derive closed-form expressions for the coefficients of the interpretable model when the function to explain is linear. The good news is that these coefficients are proportional to the gradient of the function to explain: LIME indeed discovers meaningful features. However, our analysis also reveals that poor choices of parameters can lead LIME to miss important features.

#### Variable Prioritization in Nonlinear Black Box Methods, with Application in Genomics and to Interpreting Deep Neural Networks

**Speaker**: Seth Flaxman (Lecturer in Statistics at the Department of Mathematics, Imperial College London)

**Abstract**

I will present two recent papers (https://arxiv.org/abs/1801.07318 and https://arxiv.org/abs/1901.09839) describing our work on developing new methods to interpret nonlinear Bayesian machine learning models. In the first paper, we address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel and interpretable way to summarize the relative importance of predictor variables. Methodologically, we develop the “RelATive cEntrality” (RATE) measure to prioritize candidate genetic variables that are not just marginally important, but whose associations also stem from significant covarying relationships with other variants in the data. We illustrate RATE through Gaussian process regression, but the methodological innovations apply to other “black box” methods. In the second paper, we extend these methods to deep neural networks (DNNs) and computer vision. DNNs are successful across a variety of domains, yet our ability to explain and interpret these methods is limited. We propose an effect size analogue for DNNs that is appropriate for applications with highly collinear predictors (ubiquitous in computer vision).

#### Learning on Aggregate Outputs with Kernels

**Speaker**: Dino Sejdinovic (Associate Professor at the Department of Statistics, University of Oxford)

**Abstract**

While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on variational learning with a model of output aggregation and Gaussian processes, where aggregation leads to intractability of the standard evidence lower bounds. We propose new bounds and tractable approximations, leading to improved prediction accuracy and scalability to large datasets, while explicitly taking uncertainty into account. We develop a framework which extends to several types of likelihoods, including the Poisson model for aggregated count data. We apply our framework to a challenging and important problem, the fine-scale spatial modelling of malaria incidence.

Joint work with Leon Law, Ewan Cameron, Tim CD Lucas, Seth Flaxman, Katherine Battle, and Kenji Fukumizu.

Biography: Dino Sejdinovic is an Associate Professor at the Department of Statistics, University of Oxford, a Fellow of Mansfield College, Oxford, and a Turing Fellow of the Alan Turing Institute. He previously held postdoctoral positions at the Gatsby Computational Neuroscience Unit, University College London (2011-2014) and at the Institute for Statistical Science, University of Bristol (2009-2011), and worked as a data science consultant in the financial services industry. He received a PhD in Electrical and Electronic Engineering from the University of Bristol (2009) and a Diplom in Mathematics and Theoretical Computer Science from the University of Sarajevo (2006).

#### Sparse Approximate Inference for Spatio-Temporal Point Process Models with Application to Armed Conflict

**Speaker**: Andrew Zammit Mangion (Senior Research Fellow at the University of Wollongong - NIASRA, Australia)

**Abstract**

Spatio-temporal log-Gaussian Cox process models play a central role in the analysis of spatially distributed systems in several disciplines. Yet scalable inference remains computationally challenging, due both to the high-resolution modelling generally required and to the analytically intractable likelihood function. In this talk I will demonstrate a novel way of solving this problem, which combines ideas from variational Bayes, message passing on factor graphs, expectation propagation, and sparse-matrix optimisation. The proposed algorithm is seen to scale well with the state dimension and the length of the temporal horizon, with moderate loss in distributional accuracy. It hence provides a flexible and faster alternative to both non-linear filtering-smoothing type algorithms and approaches that implement the Laplace method (such as INLA) on (block) sparse latent Gaussian models. I demonstrate its implementation in simulation studies with point-process observations, and use it to describe micro-dynamics in armed conflict in Afghanistan using data from the WikiLeaks Afghan War Diary.

This work was done in collaboration with Botond Cseke (Microsoft Research), Guido Sanguinetti (University of Edinburgh), and Tom Heskes (University of Nijmegen).

Bio: Andrew Zammit Mangion is a Senior Research Fellow at the National Institute for Applied Statistics Research Australia (NIASRA) at the University of Wollongong, Australia. His research focuses on spatial and spatio-temporal modelling of environmental phenomena, and the computational tools that enable it. He has recently co-authored a book on spatio-temporal modelling (https://www.crcpress.com/Spatio-Temporal-Statistics-with-R/Wikle-Zammit-Mangion-Cressie/p/book/9781138711136), and in 2017 was awarded a Discovery Early Career Research Award by the Australian Research Council.