Ingrid Glad

Department of Mathematics, University of Oslo, Norway

Hunting high and low
What is high dimensional statistics in 2016? For more than a decade, the field of genomics in particular has motivated much research in so-called high dimensional statistics, where the number of covariates \(p\) is much larger than the number of subjects \(n\). I will present some alterations of lasso-like methods, with examples from genomics and epigenomics. Furthermore, statisticians in 2016 must also handle large, complex datasets facing high dimensional challenges other than \(p \gg n\). Paradoxically, having extremely many observations is not necessarily a blessing; rather, it complicates inferential procedures and algorithms. I will briefly illustrate some of these emerging difficulties with experiences from the brand-new Oslo research centre BigInsight, where data are huge and inference must be distributed in various ways, inspiring new directions for statistical research.

Arnaud Doucet

Department of Statistics, Oxford University, United Kingdom

Pseudo-marginal methods for inference in latent variable models
For complex latent variable models, the likelihood function
of the parameters of interest cannot be evaluated pointwise. In this
context, standard Markov chain Monte Carlo (MCMC) strategies used to
perform Bayesian inference can be very inefficient. Pseudo-marginal
methods are an alternative class of MCMC methods which rely on an
unbiased estimator of the likelihood. These techniques have become
popular over the past 5-10 years and have found numerous applications
in fields as diverse as econometrics, genetics and machine learning.

In the first part of the talk, I will review the standard
pseudo-marginal method, present some applications and provide useful
guidelines on how to optimize the performance of the algorithm. In the
second part of the talk, I will introduce new pseudo-marginal
algorithms which rely on novel low variance Monte Carlo estimators of
likelihood ratios. The efficiency of computations is increased
relative to the standard pseudo-marginal algorithm by several orders
of magnitude.

This is joint work with George Deligiannidis (Oxford) and Michael K. Pitt (King's College London).
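The mechanism behind pseudo-marginal MCMC, plugging a non-negative unbiased likelihood estimate into the Metropolis-Hastings ratio while recycling the current estimate, can be illustrated on a toy latent variable model. This is a minimal sketch, not the algorithms of the talk; the model, the flat prior and all tuning constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent variable model: y | theta, Z ~ N(theta + Z, 1) with Z ~ N(0, 1),
# so the exact marginal likelihood is N(y; theta, 2) and, under a flat prior,
# the posterior for theta is N(y, 2).
y = 1.5

def lik_hat(theta, n_particles=100):
    """Unbiased importance-sampling estimate of p(y | theta)."""
    z = rng.normal(0.0, 1.0, n_particles)
    return np.mean(np.exp(-0.5 * (y - theta - z) ** 2) / np.sqrt(2.0 * np.pi))

def pseudo_marginal_mh(n_iter=5000, step=1.0):
    theta, lik = 0.0, lik_hat(0.0)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + step * rng.normal()
        lik_prop = lik_hat(prop)
        # The current estimate `lik` is recycled, never refreshed: this is
        # what keeps the chain's invariant distribution exactly the posterior.
        if rng.random() < lik_prop / lik:
            theta, lik = prop, lik_prop
        chain[i] = theta
    return chain

chain = pseudo_marginal_mh()
```

Despite the noisy likelihood evaluations, the marginal invariant distribution of the chain is the exact posterior N(1.5, 2); only the mixing deteriorates as the estimator's variance grows.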


See the poster overview
for titles and abstracts.

Marloes Maathuis

Seminar für Statistik, ETH Zürich, Switzerland

The role of causal modelling in statistics
Causal questions are fundamental in all parts of science. Answering such questions from non-experimental data is notoriously difficult, but there has been a lot of recent interest and progress in this field. I will explain the fundamentals of causal modelling and outline its potential and its limitations. The concepts will be illustrated by several examples.

Organizer: Jukka Corander

Timo Koski.
The Minimal Hoppe-Beta Prior Distribution for Directed Acyclic
Graphs and Structure Learning

Department of Mathematics, KTH, Stockholm, Sweden

We give a new prior distribution over directed acyclic graphs
intended for structured Bayesian networks, where the structure is
given by an ordered block model. That is, the nodes of the graph are
objects which fall into categories or blocks; the blocks have a
natural ordering or ranking. The presence of a relationship between
two objects is denoted by a directed edge from the object in the
lower-ranked category to the object in the higher-ranked category.
Hoppe's urn scheme is invoked to generate a random block scheme.
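The urn mechanism can be sketched as follows. This is a rough illustration of block generation only, under the simplest reading of the abstract; the full prior also involves the block ordering and edge probabilities, and the parameter value is illustrative.

```python
import random

def hoppe_urn_blocks(n_nodes, theta, rng):
    """Hoppe's urn: each arriving node starts a new block with probability
    theta / (theta + k), where k nodes are already placed; otherwise it
    joins an existing block with probability proportional to block size."""
    blocks = []
    for k in range(n_nodes):
        if rng.random() < theta / (theta + k):
            blocks.append([k])              # "black ball" drawn: open a new block
        else:
            r = rng.random() * k            # pick an already-placed node uniformly
            cum = 0
            for block in blocks:
                cum += len(block)
                if r < cum:
                    block.append(k)
                    break
    return blocks

rng = random.Random(7)
blocks = hoppe_urn_blocks(50, 2.0, rng)
```

The expected number of blocks after n draws is the familiar sum theta / (theta + k) over k = 0, ..., n - 1, which controls the sparsity of the resulting block structure.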

The prior in its simplest form has three parameters that control the
sparsity of the graph in two ways; implicitly in terms of the maximal
directed path and explicitly by controlling the edge probabilities.

We consider the situation where the nodes of the graph represent
random variables, whose joint probability distribution factorizes
along the DAG.

We use a minimal layering of the DAG to express the prior. We describe
Monte Carlo schemes, with a generative mechanism similar to that used
for the prior, for finding the optimal a posteriori structure given a
data matrix.

This is joint work with John M. Noble and Felix Rios.

Håkon Tjelmeland.
Prior specification of neighbourhood and interaction structure in binary
Markov random fields

Department of Mathematical Sciences, Norwegian University
of Science and Technology, Norway

Discrete Markov random fields (MRFs) defined on a rectangular lattice
are frequently used as prior distributions in image analysis
applications. A few articles, for example the early Heikkinen and
Högmander (1994) and Higdon et al. (1997), and the more recent
Friel and Rue (2007) and Everitt (2012),
have also considered a corresponding
fully Bayesian situation by assigning a hyper-prior to the parameters
of the discrete MRF. However, in these articles a fixed first-order
neighbourhood and a fixed parametric form for the MRF are assumed.

In this presentation we limit the attention to binary MRFs and discuss
the fully Bayesian setting introduced in Arnesen and Tjelmeland (2016).
We assign prior distributions to all parts of the MRF specification.
In particular we define priors for the neighbourhood structure of the
MRF, for what interactions to include in the model, and for the
parameter values. We consider two parametric forms for
the energy function of the MRF, one where the parameters represent
interaction strengths and one where the parameters are potential values.
Both parameterisations have important advantages and disadvantages, and to
combine the advantages of both formulations our final prior formulation
is based on both parametrisations. The prior for the neighbourhood and
what interactions to include in the MRF is based on the parameterisation
using interaction strengths, whereas the prior for the parameter
values is based on the parameterisation where the parameters are
potential values.

We define a reversible jump Markov chain Monte Carlo (RJMCMC)
procedure to simulate from the corresponding posterior distribution
when conditioning on an observed scene. Thereby we are able to learn both the neighbourhood
structure and the parametric form of the MRF from the observed
scene. In particular we learn whether a pairwise interaction
model is sufficient to model the scene of interest, or
whether a higher-order interaction model is preferable.
We circumvent evaluations of the intractable
normalising constant of the MRF when running the RJMCMC
algorithm by adopting a previously defined approximate
auxiliary variable algorithm. We demonstrate the usefulness
of our prior in two simulation examples and one real
data example.

**References**

Arnesen and Tjelmeland (2016). Prior
specification of neighbourhood and interaction structure in binary
Markov random fields, Statistics and Computing.

Everitt, R. G. (2012). Bayesian parameter estimation for latent Markov
random fields and social networks, Journal of Computational and
Graphical Statistics, **21**, 940–960.

Friel, N. and Rue, H. (2007). Recursive computing and simulation-free
inference for general factorizable models, Biometrika,
**94**, 661–672.

Heikkinen, J. and Högmander, H. (1994). Fully Bayesian approach to
image restoration with an application in biogeography, Applied
Statistics, **43**, 569–582.

Higdon, D. M., Bowsher, J. E., Johnsen, V. E., Turkington, T. G.,
Gilland, D. R. and Jaszczak, R. J. (1997). Fully Bayesian estimation
of Gibbs hyperparameters for emission computed tomography data, IEEE
Transactions on Medical Imaging, **16**, 516–526.

Jukka Corander.
Likelihood-free inference via machine learning

Department of Mathematics and Statistics,
University of Helsinki, Finland

Likelihood-free inference, ABC and synthetic likelihood have
recently been popularized as techniques for inferring parameters in
intractable simulator-based models. In this talk we consider how
various machine learning methods can provide ways to both speed-up the
computation and to quantify the approximate likelihood in a consistent
manner. Examples from ecology and infectious disease epidemiology are
used to illustrate the use of machine learning in applications.

Organizer: Mark Podolskij

Mikkel Bennedsen.
Hybrid scheme for Brownian semistationary processes

Aarhus University, Denmark

We introduce a simulation scheme for a large class of rough
processes called Brownian semistationary processes. The scheme is
based on discretizing the stochastic integral representation of the
process in the time domain. We assume that the kernel function of the
process is regularly varying at zero. The novel feature of the scheme
is to approximate the kernel function by a power function near zero
and by a step function elsewhere. The resulting approximation of the
process is a combination of Wiener integrals of the power function and
a Riemann sum, which is why we call this method a hybrid scheme. The
scheme leads to a substantial improvement of accuracy compared to the
ordinary forward Riemann-sum scheme, while having the same
computational complexity.

This is joint work with Asger Lunde and Mikko S. Pakkanen.
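The idea of treating the kernel exactly near its singularity and by a Riemann sum elsewhere can be checked in the simplest setting X = \(\int_0^1 u^{\alpha}\,dW(u)\). This is a simplified sketch, not the scheme of the paper: it uses a pure power kernel, a single exactly simulated cell, and illustrative constants.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, n, n_sim = -0.3, 50, 20000   # kernel g(u) = u**alpha on (0, 1], n grid cells

# Cell nearest zero: simulate the Wiener integral of the power function exactly.
# V = int_0^{1/n} u^alpha dW(u) is centred Gaussian with variance
# int_0^{1/n} u^(2 alpha) du = (1/n)**(2 alpha + 1) / (2 alpha + 1).
var_v = (1.0 / n) ** (2 * alpha + 1) / (2 * alpha + 1)
v = rng.normal(0.0, np.sqrt(var_v), n_sim)

# Remaining cells: freeze the kernel at the cell midpoints (Riemann-sum part),
# with independent Brownian increments of variance 1/n.
mids = (np.arange(1, n) + 0.5) / n
dw = rng.normal(0.0, np.sqrt(1.0 / n), (n_sim, n - 1))
x = v + dw @ (mids ** alpha)

# The target X = int_0^1 u^alpha dW(u) has variance 1 / (2 alpha + 1).
target_var = 1.0 / (2 * alpha + 1)
```

A plain forward Riemann sum would have to evaluate the singular kernel near zero, where the power-function approximation of the hybrid scheme instead contributes an exactly simulated Gaussian term.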

Susanne Ditlevsen.
Multi-class oscillating systems of interacting neurons

Department of Mathematical Sciences, University of
Copenhagen, Denmark

We consider multi-class systems of interacting nonlinear Hawkes
processes (Hawkes, 1971) modeling several large families of neurons
and study their mean field limits. As the total number of neurons goes
to infinity we prove that the evolution within each class can be
described by a nonlinear limit differential equation driven by a
Poisson random measure, and state associated central limit theorems.
We study situations in which the limit system exhibits oscillatory
behavior, and relate the results to certain piecewise deterministic
Markov processes and their diffusion approximations.
The motivation for this paper comes from the rhythmic scratch like
network activity in the turtle, induced by a mechanical stimulus, and
recorded and analyzed by Berg and co-workers (Berg et al., 2007).
Oscillations in a spinal motoneuron are initiated by the sensory
input, and continue through some internal mechanisms for some time
after the stimulus is terminated. While mechanisms of rapid processing are
well documented in sensory systems, rhythm-generating motor circuits
in the spinal cord are poorly understood. The activation leads to an
intense synaptic bombardment of both excitatory and inhibitory input,
and it is of interest to characterize such network activity, and to
build models which can generate self-sustained oscillations.
Generally, biological rhythms are ubiquitous in living organisms. The
brain controls and helps maintain the internal clock for many of these
rhythms, and fundamental questions are how they arise and what is
their purpose. Many examples of such biological oscillators can be
found in the classical book by Glass and Mackey (1988).

The talk is based on the paper Ditlevsen and Löcherbach (2016).

**References**

Hawkes, A. G. (1971), Spectra of Some Self-Exciting and Mutually Exciting Point Processes. Biometrika, **58**, 83-90.

Berg, R.W., Alaburda, A., Hounsgaard, J. (2007). Balanced Inhibition and Excitation Drive Spike Activity in Spinal Half-Centers. Science **315**, 390-393.

Glass, L., Mackey, M.C. (1988). *From Clocks to Chaos: The Rhythms of Life*. Princeton University Press.

Ditlevsen, S., Löcherbach, E. (2016). Multi-class oscillating systems of interacting neurons.

Department of Mathematics, Aarhus University, Denmark

In this talk we present some new limit theorems for power variation of
stationary increments Lévy driven moving averages. In
this infill sampling setting, the asymptotic theory gives very
surprising results, which (partially) have no counterpart in the
theory of discrete moving averages. More specifically, we will show
that the first order limit theorems and the mode of convergence
strongly depend on the interplay between the given order of the
increments, the considered power, the Blumenthal-Getoor index of the
driving pure jump Lévy process and the behaviour of the kernel
function near zero. First order asymptotic theory essentially
comprises three cases: stable convergence towards a certain infinitely
divisible distribution, an ergodic type limit theorem and convergence
in probability towards an integrated random process. We also prove the
second order limit theorem connected to the ergodic type result.

Richard Samworth

Statistical Laboratory, University of Cambridge, United Kingdom

Random projection ensemble classification
We introduce a very general method for high-dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower-dimensional space. In one special case that we study in detail, the random projections are divided into non-overlapping blocks, and within each block we select the projection yielding the smallest estimate of the test error. Our random projection ensemble classifier then aggregates the results of applying the base classifier on the selected projections, with a data-driven voting threshold to determine the final assignment. Our theoretical results elucidate the effect on performance of increasing the number of projections. Moreover, under a boundary condition implied by the sufficient dimension reduction assumption, we show that the test excess risk of the random projection ensemble classifier can be controlled by terms that do not depend on the original data dimension. The classifier is also compared empirically with several other popular high-dimensional classifiers via an extensive simulation study, which reveals its excellent finite-sample performance.
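The block-and-vote construction can be sketched on toy data. This is a hedged, simplified illustration, not the authors' method: a nearest-centroid rule stands in for the arbitrary base classifier, the test-error estimate is a plain training-error estimate, the voting threshold is fixed at 1/2 rather than data-driven, and all dimensions and constants are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

def nearest_centroid(x_tr, y_tr, x_te):
    """Base classifier: assign each point to the nearer class centroid
    (a stand-in for an arbitrary base classifier)."""
    c0, c1 = x_tr[y_tr == 0].mean(axis=0), x_tr[y_tr == 1].mean(axis=0)
    return (((x_te - c1) ** 2).sum(axis=1) < ((x_te - c0) ** 2).sum(axis=1)).astype(int)

def rp_ensemble(x_tr, y_tr, x_te, d=5, n_blocks=30, block_size=10):
    """Within each block of random Gaussian projections, keep the projection
    with the smallest error estimate; aggregate the votes over blocks."""
    p = x_tr.shape[1]
    votes = np.zeros(len(x_te))
    for _ in range(n_blocks):
        best_err, best_a = np.inf, None
        for _ in range(block_size):
            a = rng.normal(size=(p, d)) / np.sqrt(p)
            err = (nearest_centroid(x_tr @ a, y_tr, x_tr @ a) != y_tr).mean()
            if err < best_err:
                best_err, best_a = err, a
        votes += nearest_centroid(x_tr @ best_a, y_tr, x_te @ best_a)
    return (votes / n_blocks > 0.5).astype(int)   # fixed 1/2 vote threshold here

# Toy high-dimensional data: the signal lives in the first coordinate only.
n, p = 300, 50
y_tr = np.repeat([0, 1], n // 2)
x_tr = rng.normal(size=(n, p)); x_tr[:, 0] += 3.5 * y_tr
y_te = np.repeat([0, 1], n // 2)
x_te = rng.normal(size=(n, p)); x_te[:, 0] += 3.5 * y_te
acc = (rp_ensemble(x_tr, y_tr, x_te) == y_te).mean()
```

Even though each individual projection sees only a heavily diluted signal, selecting within blocks and aggregating across blocks recovers good accuracy, which is the effect the theoretical results quantify.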

Organizer: Anders Nordgaard

Silvia Bozza.
Bayesian multilevel models for forensic data

School of Criminal Justice, The University of Lausanne; Department of Economics, Ca’ Foscari University of Venice

In forensic science, statistical methods are largely used
for assessing the probative value of scientific evidence. The
evaluation of measurements on characteristics associated to trace
evidence is performed through the derivation of a Bayes factor, a
rigorous concept that provides a balanced measure of the degree to
which evidence is capable of discriminating among competing
propositions that are suggested by opposing parties at trial. The
assessment of a Bayes factor may be a demanding task, essentially
because of the complexity of the scenario at hand and the possibly
poor information at the forensic scientist's disposal. Moreover,
forensic laboratories frequently have access to equipment which can
readily provide scientific evidence in the form of multivariate data,
and available databases may be characterised by a complex dependence
structure with several levels of variation and a large number of
variables. One of the criticisms that is levelled against the use of
multivariate techniques in forensic science is the lack of background
data from which to estimate parameters and several attempts have been
proposed to achieve a dimensionality reduction. Clearly, any
statistical methodology which reduces the multivariate structure to
fewer or even only one dimension needs careful justification in order
to avoid the challenge of suppression of evidence. Bayesian multilevel
models for the evaluation of multivariate measurements on
characteristics associated with questioned material, capable of
dealing with such constraints (e.g., correlation between variables and
multiple sources of variation), may be proposed in various forensic
domains. Numerical procedures may be
implemented to handle the complexity and to compute the marginal
likelihoods under competing propositions. This, along with the
acknowledgement of subjective evaluations that are unavoidably
involved in the Bayes factor assignment, has originated a large debate
in the forensic community about its admissibility at trial. These
ideas will be illustrated with reference to handwriting examination, a
forensic discipline that attracts nowadays considerable attention due
to its uncertain status under new admissibility standards.

Norwegian University of Life Sciences, Ås, Norway

In forensic genetics one commonly encounters biological
stains that contain DNA from several individuals, so called mixture
DNA profiles, where the goal is to identify the individual
contributors. One example is a rape case where an evidence sample
shows a mixture of the victim and the perpetrator. A further
complication is when the individuals in the mixture are also related,
because they are likely to have a more similar individual DNA profile
than unrelated individuals. Disregarding this relationship may lead to
an overestimation of the evidence against the suspect. In other cases
one may wish to determine the relationship between individuals based
on a DNA mixture. An example is prenatal paternity testing, where the
father of a child is determined based on a blood sample from the
mother that reveals a DNA mixture of mother and child. We will look at
how the weight of evidence can be estimated for DNA mixtures with
related contributors. There is a long tradition of statistics in
(forensic) genetics and the talk will present and discuss statistical
models for the mentioned applications. Stochastic models are becoming
increasingly relevant as methods are getting more sensitive and
distinguishing noise from signal more challenging.

Anders Nordgaard.
Predictive distributions of the percentages of narcotic substances in drug seizures

Swedish Police Authority and National Forensic Centre & Colin Aitken, School of Mathematics, University of Edinburgh

The percentage of the narcotic substance in a drug seizure
may vary a lot depending on when and from whom the seizure was
taken. Seizures from a typical consumer would in general show low
percentages, while seizures from the early stages of a drug-dealing
chain would show higher percentages (these will subsequently be
diluted). Historical records from the determination of the percentage
of narcotic substance in seized drugs reveal that the mean percentage
but also the variation of the percentage can differ substantially
between years. Some drugs show close to monotonic trends while others
are more irregular in the temporal variation. Legal fact finders must
have an up-to-date picture of what is an expected level of the
percentage and what levels are to be treated as unusually low or
unusually high. This is important for the determination of the
sentences to be given in a drug case. In this work we treat the
probability distribution of the percentage of a narcotic substance in
a seizure from year to year as a time series of functions. The
functions are probability density functions of beta distributions,
which are successively updated with the use of point mass posteriors
for the shape parameters. The predictive distribution for a new year
is a weighted sum of beta distributions for the previous years where
the weights are found from forward validation.
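The predictive construction, a weighted sum of yearly beta densities, can be sketched with standard-library tools. In this hedged sketch, moment-matched point estimates stand in for the paper's point-mass posteriors, the yearly data are invented, and the forward-validation weights are simply supplied by hand.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density, computed via log-gamma for numerical stability."""
    log_c = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_c + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def fit_beta(xs):
    """Moment-matched shape parameters for one year's percentages (in (0, 1))."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    k = m * (1 - m) / v - 1            # requires v < m * (1 - m)
    return m * k, (1 - m) * k

def predictive_pdf(history, weights):
    """Predictive density for a new year: weighted sum of yearly beta densities."""
    params = [fit_beta(xs) for xs in history]
    return lambda x: sum(w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params))

history = [[0.28, 0.35, 0.31, 0.40, 0.33],   # illustrative yearly percentage data
           [0.36, 0.42, 0.39, 0.45, 0.41],
           [0.44, 0.50, 0.47, 0.52, 0.49]]
pdf = predictive_pdf(history, [0.2, 0.3, 0.5])   # weights here fixed by hand
```

Because each component is a proper density and the weights sum to one, the mixture integrates to one, and upweighting recent years lets the predictive distribution track monotonic trends in the percentages.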

Organizer: Egil Ferkingstad

Egil Ferkingstad.
Improving the INLA approach for approximate Bayesian inference for latent Gaussian models

University of Iceland, Iceland

We introduce a new copula-based correction for generalized
linear mixed models (GLMMs) within the integrated nested Laplace
approximation (INLA) approach for approximate Bayesian inference for
latent Gaussian models. While INLA is usually very accurate, some
(rather extreme) cases of GLMMs with e.g. binomial or Poisson data
have been seen to be problematic. Inaccuracies can occur when there is
a very low degree of smoothing or “borrowing strength” within the
model, and we have therefore developed a correction aiming to push the
boundaries of the applicability of INLA. Our new correction has been
implemented as part of the R-INLA package, and adds only negligible
computational cost. Empirical evaluations on both real and simulated
data indicate that the method works well.

This is joint work with Håvard Rue (NTNU).

Óli Páll Geirsson.
An MCMC split sampler for latent Gaussian models

University of Iceland, Iceland

Latent Gaussian models (LGMs) form a flexible subclass of
Bayesian hierarchical models and have become popular in many areas of
statistics and various fields of applications, as LGMs are both
practical and readily interpretable. Although LGMs are well suited
from a statistical modeling point of view their posterior inference
becomes computationally challenging when latent models are desired for
more than just the mean structure of the data density function; or
when the number of parameters associated with the latent model
increases.

We propose a novel, computationally efficient Markov chain Monte Carlo
(MCMC) scheme to address these computational issues, which we refer to
as the MCMC split sampler. The sampling scheme is designed to
handle LGMs where latent models are imposed on more than just the mean
structure of the likelihood; to scale well in terms of computational
efficiency when the dimensions of the latent models increase; and to
be applicable for any choice of a parametric data density function.
The main novelty of the MCMC split sampler lies in how the model
parameters of an LGM are split into two blocks, such that one of the
blocks exploits the latent Gaussian structure in a natural way and
becomes invariant of the data density function.

This is joint work with Birgir Hrafnkelsson (University of Iceland),
Helgi Sigurðarson (University of Iceland) and Daniel Simpson
(University of Bath).

Tore Selland Kleppe.
Bayesian Analysis in Non-linear Non-Gaussian State-Space Models using Particle Gibbs

University of Stavanger, Norway

We consider Particle Gibbs (PG) as a tool for Bayesian
analysis of non-linear non-Gaussian state-space models. PG is a Monte
Carlo (MC) approximation of the standard Gibbs procedure which uses
sequential MC (SMC) importance sampling inside the Gibbs procedure to
update the latent and potentially high-dimensional state trajectories.
We propose to combine PG with a generic and easily implementable SMC
approach known as Particle Efficient Importance Sampling (PEIS). By
using SMC importance sampling densities which are closely globally
adapted to the targeted density of the states, PEIS can substantially
improve the mixing and the efficiency of the PG draws from the
posterior of the states and the parameters relative to existing PG
implementations. The efficiency gains achieved by PEIS are illustrated
in PG applications to a stochastic volatility model for asset returns
and a Gaussian nonlinear local level model for interest rates.

This is joint work with Oliver Grothe (Karlsruhe Institute of Technology)
and Roman Liesenfeld (University of Cologne).

Jonathan Taylor

Department of Statistics, Stanford University, USA

Selective inference in linear regression
We consider inference after model selection in linear regression
problems, specifically after fitting the LASSO (Lee et al.). A
classical approach to this problem is data splitting, using some
randomly chosen portion of the data to choose the model and the
remaining data for inference in the form of confidence intervals and
hypothesis tests. Viewing this problem in the selective inference
framework of Fithian et al., we describe a few other randomized
algorithms with similar guarantees to data splitting, at least in the
parametric setting (Tian and Taylor). Time permitting, we describe
analogous results from (Tian and Taylor) for arbitrary statistical
functionals obeying a CLT in the classical fixed dimensional setting
and inference after choosing a tuning parameter by cross-validation.
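The data-splitting baseline can be sketched in a few lines. In this hedged illustration, marginal screening stands in for the LASSO selection step, and the data-generating model and all constants are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy sparse regression: only the first coefficient is nonzero.
n, p = 400, 10
x = rng.normal(size=(n, p))
y = 1.0 * x[:, 0] + rng.normal(size=n)

# Select a variable on the first half of the data...
half = n // 2
j = int(np.argmax(np.abs(x[:half].T @ y[:half])))

# ...then do classical inference for it on the held-out half. Because the
# inference half played no role in selection, the usual interval is valid
# for the selected coefficient.
xs, ys = x[half:, j], y[half:]
beta = float(xs @ ys / (xs @ xs))
resid = ys - xs * beta
se = float(np.sqrt(resid.var(ddof=1) / (xs ** 2).sum()))
ci = (beta - 1.96 * se, beta + 1.96 * se)
```

The randomized algorithms of the talk aim for the same kind of guarantee while using the data more efficiently than this hard split.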

**References**

Lee et al. Exact post-selection inference, with application to the LASSO.

Fithian et al. Optimal Inference After Model Selection.

Tian and Taylor. Selective inference with a randomized response.

Organizer: Niels Richard Hansen

Niels Keiding.
Generalization from self-selected epidemiological studies

Section of Biostatistics, University of Copenhagen, Denmark

Low front-end cost and rapid accrual make web-based surveys
and enrollment in studies attractive. Participants are often
self-selected with little reference to a well-defined study base. Of
course, high quality studies must be internally valid (validity of
inferences for the sample at hand), but web-based sampling reactivates
discussion of the nature and importance of external validity
(generalization of within-study inferences to a target population or
context) in epidemiology. A classical epidemiological approach would
emphasize representativity, usually conditional on important
confounders. An alternative view held by influential epidemiologists
claims that representativity (in a narrow sense) is irrelevant for the
scientific nature of epidemiology. Against this background, it is a
good time for statisticians to take stock of our role and position
regarding surveys and observational research in epidemiology. The
central issue is whether conditional effects in the study population
may be transported to desired target populations. This will depend on
the compatibility of causal structures in study and target
populations, and will require subject matter considerations in each
concrete case. Statisticians, epidemiologists and survey researchers
should work together to develop increased understanding of these
challenges and improved tools to handle them.

**References**

Keiding, N. & Louis, T. A. (2016). Perils and potentials of self-selected entry to
epidemiological studies and surveys (with discussion). Journal of the
Royal Statistical Society, Series A, **179**, 319-376.

Torben Martinussen.
Instrumental variables estimation with competing risk data

Section of Biostatistics, University of Copenhagen, Denmark

Martinussen et al. (2015) used semiparametric structural
cumulative failure time model and instrumental variables (IV) to
estimate causal exposure effects for survival data. They impose no
restrictions on the type of the instrument nor on the exposure
variable. Furthermore their method allows for nonparametric estimation
of possible time changing exposure effect. In this work we extend the
methods of Martinussen et al. (2015) to handle competing risk
data. Such data are very common in practice when studying the timing
of initiation of a specific disease since death will often be a
competing event. Also when studying death due to a specific cause,
such as death from breast cancer as was of interest in the HIP study,
death from any other cause is a competing event. The HIP study
comprises approximately 60,000 women, and in the first 10 years of
follow-up there are 4221 deaths, of which only 340 were deemed due to
breast cancer. Hence, competing risks are a major issue in these
data. Due to non-compliance it is not straightforward to estimate the
screening effect. Randomization can, however, be used as an IV and,
hence, for these data it is pertinent to have IV-methods for competing
risk data to learn about the causal effect of breast cancer screening
on the risk of dying from breast cancer.

This is joint work with Stijn Vansteelandt (Ghent University).

**References**

Martinussen, T., ... (2015).

Organizer: David Bolin

University of Bath, United Kingdom

The EUSTACE project will give publicly available daily
estimates of surface air temperature since 1850 across the globe for
the first time by combining surface and satellite data using novel
statistical techniques. Designing and estimating a stochastic model
that can realistically capture the multiscale statistical behaviour of
air temperature across a wide range of time-scales is not only a
modelling challenge, but also a
computational challenge. Existing methods for spatial statistics need
to be scaled up to handle a large quantity of non-Gaussian data, as
well as to properly quantify the uncertainty of the temperature
reconstructions in regions and time periods with small quantities of
data.

Geir-Arne Fuglstad.
A New Prior that Penalises the Complexity of Stationary and Non-stationary Spatial Fields

Norwegian University of Science and Technology, Norway

Gaussian random fields (GRFs) are important building blocks
in hierarchical models for spatial data, but their parameters
typically cannot be consistently estimated under in-fill asymptotics.
Even for stationary Matérn GRFs, the posteriors for range and marginal
variance do not contract and for non-stationary models there is a high
risk of overfitting the data. Despite this, there is no practically
useful, principled approach for selecting the prior on their
parameters, and the prior typically must be chosen in an ad hoc
manner.

We propose to construct priors such that simpler models are preferred,
i.e. shrinking stationary GRFs towards infinite range and no effect,
and shrinking non-stationary GRFs towards stationary GRFs. We use the
recent Penalised Complexity prior framework to construct a practically
useful, tunable, weakly informative joint prior on the range and the
marginal variance for a Matérn GRF with fixed smoothness, and then
extend the prior to non-stationary GRFs controlled by covariates in the
covariance structure. We apply the priors to a dataset of annual
precipitation in southern Norway and show that the scheme for
selecting the hyperparameters of the non-stationary extension leads to
improved predictive performance over the stationary model.

David Bolin.
Multivariate non-Gaussian Matérn fields

University of Gothenburg, Sweden

Developing models for multivariate spatial data has been an
active research area in recent years. However, most research has
focused on Gaussian models and there are few practically useful
methods for multivariate non-Gaussian geostatistical data. We present
a new class of multivariate Matérn random fields, constructed as
solutions to systems of stochastic partial differential equations
driven by generalized hyperbolic noise. The fields have flexible
marginal distributions and are suitable for problems where Gaussianity
cannot be assumed for one or more of the dimensions in the data. The
model parameters can be estimated efficiently using a likelihood-based
method, also when the fields are incorporated in a geostatistical
setting with irregularly spaced observations, measurement errors, and
covariates. Finally, a comparison with standard Gaussian models is
presented for an application to precipitation data.

Organizer: Kjetil Røysland

Jon Michael Gran.
Causal inference in multi-state models using the G-formula: application to data on sickness absence and work

Department of Biostatistics, University of Oslo, Norway

Multi-state models, as an extension of traditional models in
survival analysis, have proved to be a flexible framework for
analysing transitions between various states of sickness absence and
work over time using data from national registries. A main aim in
sickness absence research is to identify the effects of possible
interventions, e.g. with respect to increased work participation. When
data on the important confounders are available, either through the
same registry data or through linkage with other data sources, we have
suggested using methods based on inverse probability weighting or
g-computation for identifying such effects.

G-computation, or applying the G-formula, is a very flexible approach
for identifying marginal treatment effects in multi-state models. It
is closely related to traditional ways of making inference from these
types of models, but can also be extended to cover a wide set of
exposures and confounding situations, such as more intricate treatment
regimes and time-dependent confounding. In this talk we will discuss
two ways of performing g-computation in multi-state models: one based
on intervening on transition intensities and one based on intervening
on additional covariates, and how these can be equivalent given
certain specifications of the multi-state models. We will discuss pros
and cons of applying the G-formula in multi-state models and how the
simple implementations can be extended to address more advanced causal
questions.

The methods will be illustrated using Norwegian population-wide
registry data on sickness absence, disability and work participation,
coupled with data from other registries.
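
The intensity-based flavour of g-computation can be sketched with a
minimal discrete-time Markov approximation (all states and numbers
below are hypothetical, chosen only for illustration, not from the
registry analyses): intervene on one transition probability and
recompute the marginal state-occupation probabilities.

```python
import numpy as np

# Hypothetical monthly transition matrix over states
# (work, sickness absence, disability); disability is absorbing.
P = np.array([[0.95, 0.04, 0.01],
              [0.30, 0.65, 0.05],
              [0.00, 0.00, 1.00]])

def occupation(P, months=60):
    """Marginal state-occupation probabilities after a number of months."""
    p = np.array([1.0, 0.0, 0.0])   # everyone starts in work
    for _ in range(months):
        p = p @ P
    return p

# g-computation by intervening on a transition intensity: halve the
# sickness -> disability transition, moving the mass to sickness -> work
P_int = P.copy()
P_int[1, 2] = 0.025
P_int[1, 0] += 0.025

print("natural course:", occupation(P).round(3))
print("intervention  :", occupation(P_int).round(3))
```

Under the Markov assumption the two occupation vectors play the roles
of the natural-course and post-intervention g-formulas.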

Mats Julius Stensrud.
Assessing paradoxes in medical studies: Do not forget the frailty

Department of Biostatistics, University of Oslo, Norway

A wide range of associations in medicine are claimed to be
paradoxical. Many of these associations, however, may have plausible
explanations. Causal diagrams are often used to argue that
counter-intuitive associations are examples of selection bias. These
diagrams, however, do not generally allow one to explore the direction
and magnitude of the bias. For real-life analyses, a numerical
evaluation of the bias may be essential.

By combining causal DAGs and quantitative frailty models, we improve
the understanding of counter-intuitive associations in epidemiology.
First, we consider treatments that are examined over time, and we
point to a time-dependent Simpson's paradox. For example, we show that
a treatment with constant effect can appear beneficial at time \(t=0\),
but harmful at \(t>0\). Then, we explore a competing risks setting,
where being at increased risk of one event may falsely reduce the risk
of another event. Finally, we reveal spurious effects that appear in
studies of a diseased population (index-event studies). In particular,
analyses estimating the effect of a risk factor (e.g. obesity) on an
outcome (e.g. mortality) in a population with a chronic disease (e.g.
kidney failure), will be prone to the index-event bias.

Our examples show that frailty will lead to bias in common
medical scenarios. It is important that applied researchers recognize
these spurious associations. A numerical evaluation of the frailty
bias will often be appropriate.
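
A small simulation conveys the frailty mechanism (a sketch with gamma
frailty and invented rates, not the analyses of the talk): the
conditional hazard ratio is a constant 0.5, yet the observed marginal
hazard ratio drifts towards 1 over time, because high-frailty subjects
are removed from the control arm faster. Other frailty distributions
can push the observed ratio past 1, making the treatment appear
harmful.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
theta = 2.0   # variance of the gamma frailty (mean fixed at 1)
hr = 0.5      # constant conditional hazard ratio of treatment

def sim(rate_mult):
    z = rng.gamma(1.0/theta, theta, n)            # frailty: E[Z]=1, Var[Z]=theta
    return rng.exponential(1.0/(z*rate_mult), n)  # conditional hazard z*rate_mult

t_ctrl, t_trt = sim(1.0), sim(hr)

# crude observed (marginal) hazard per time bin: events / person-time at risk
bins = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])
hrs = []
for lo, hi in zip(bins[:-1], bins[1:]):
    def hz(t):
        at_risk = t >= lo
        events = at_risk & (t < hi)
        return events.sum() / (np.clip(t[at_risk], lo, hi) - lo).sum()
    hrs.append(hz(t_trt) / hz(t_ctrl))
    print(f"[{lo}, {hi}): observed hazard ratio {hrs[-1]:.2f}")
```

The printed ratios rise towards 1 even though nothing about the
treatment effect itself changes.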

Kjetil Røysland.
Causal local independence models

Department of Biostatistics, University of Oslo, Norway

Survival analysis has become one of the fundamental fields
of biostatistics. Such analyses are almost always subject to
censoring. This necessitates special statistical techniques and forces
statisticians to think more in terms of stochastic processes. The
theory of stochastic integrals and martingales has therefore been
important for the development of such techniques.

Causal inference has lately had a huge impact on how statistical
analyses based on non-experimental data are done. The idea is to use
data from a non-experimental scenario that could be subject to several
spurious effects and then fit a model that would govern the
frequencies we would have seen in a related hypothetical scenario
where the spurious effects are eliminated. This opens the door to
using the Nordic health registries to answer new and more ambitious
questions. However, there has been much less focus on causal inference
based on time-to-event data or survival analysis.

The now well established theory of causal Bayesian networks is for
instance not suitable for handling such processes. Motivated by causal
inference for event-history data from the health registries, we
introduce causal local independence models. We show that they offer a
generalization of causal Bayesian networks that also enables us to
carry out causal inference based on non-experimental data when
continuous-time processes are involved.

The main purpose of this work is to provide new tools for determining
the identifiability of causal effects in a dynamic context. We provide
criteria based on local independence graphs for identifiability of
causal effects. Typically, one can develop graphical criteria for when
unmeasured processes disturb a statistical analysis, or when these can
safely be ignored. This is done by combining previous work on local
independence graphs and \(\delta\)-separation by Vanessa Didelez and
previous work on causal inference for counting processes by Kjetil
Røysland.

Organizer: Henrik Madsen

Bjarne Ersbøll.
Smart Cities, Smart Societies, and Smart Countrysides

Department of Applied Mathematics and Computer Science, DTU, Lyngby, Denmark

Volume, Velocity, Variety, Veracity. These terms are commonly heard
when one hears about Big Data. However, most of the participants at
Nordstat will not feel uncomfortable on hearing these terms. On the
contrary, we are trained to tackle these kinds of problems, so why the
big fuss? During this talk I will try to convey experience gained
through a sector development project targeted at addressing a societal
challenge. Other facets of Big Data pose challenges which necessitate
the involvement of competences and skills other than those of
statistics and data analysis.

Henrik Madsen.
Using big data analytics for enabling intelligent and integrated energy systems in Smart Cities

Department of Applied Mathematics and Computer Science, DTU, Lyngby, Denmark

This presentation will briefly present some of the results and
methods obtained within the DSF (Danish Council for Strategic
Research) Center for IT-Intelligent Energy Systems in cities (CITIES).
Using big data analytics and methods for stochastic optimization -
including stochastic control - the scientific objective of CITIES is
to develop methodologies and ICT solutions for the analysis, operation
and development of fully integrated urban energy systems. A holistic
research approach will aim at providing solutions at all levels
between the households and the global energy system at all essential
temporal and spatial scales. The societal objective is to harvest the
power of big data analytics to identify and establish realistic
pathways to ultimately achieving independence from fossil fuels by
harnessing the latent flexibility of the energy system through big
data analytics, IT-intelligence, integration and planning.

Organizer: Nils Lid Hjort

Bo H. Lindqvist.
Conditional, posterior and fiducial sampling

Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim

The point of departure of the talk is an algorithm for
sampling from conditional distributions given a sufficient statistic.
In certain cases this can be done by a simple parameter adjustment of
the original statistical model [1], but in general one needs to use a
weighted sampling scheme [2]. The trick is to introduce a distribution
on the parameter space, where the use of improper distributions is
seen to have some advantages. As will be demonstrated, the approach is
closely related to the problem of sampling from posterior
distributions, and is also connected to fiducial inference [4, 5].
Particular emphasis will be given to the role of improper
distributions, where a theoretical framework that includes improper
laws will be briefly reviewed [3, 6].

This is joint work with Gunnar Taraldsen.
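
The flavour of the algorithm in [1] can be conveyed in the simplest
exponential case (a toy sketch, not the general weighted scheme of
[2]): the sum is sufficient, and a sample conditioned on the observed
sum is obtained by simulating under an arbitrary parameter value and
rescaling.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s_obs = 5, 12.3   # sample size and observed value of the sufficient statistic

def conditional_sample(n, s_obs, size=4):
    """Draw from the law of an i.i.d. Exp sample given that its sum is s_obs."""
    x = rng.exponential(1.0, (size, n))              # simulate under any rate (here 1)
    return s_obs * x / x.sum(axis=1, keepdims=True)  # rescale to match the statistic

samples = conditional_sample(n, s_obs)
print(samples.sum(axis=1))   # every row sums to s_obs, whatever rate was used
```

The rescaling works because the normalized spacings are pivotal, i.e.
their distribution does not depend on the rate parameter; for models
without such structure the weighted sampling of [2] is needed.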

**References**

[1] Engen, S., & Lillegård, M. (1997). Stochastic simulations conditioned on sufficient statistics. Biometrika, 84(1), 235-240.

[2] Lindqvist, B. H., & Taraldsen, G. (2005). Monte Carlo conditioning on a sufficient statistic. Biometrika, 92(2), 451-464.

[3] Taraldsen, G., & Lindqvist, B. H. (2010). Improper priors are not improper. The American Statistician, 64(2), 154-158.

[4] Taraldsen, G., & Lindqvist, B. H. (2013). Fiducial theory and optimal inference. The Annals of Statistics, 41(1), 323-341.

[5] Taraldsen, G., & Lindqvist, B. H. (2015). Fiducial and posterior sampling. Communications in Statistics - Theory and Methods, 44(17), 3754-3767.

[6] Taraldsen, G., & Lindqvist, B. H. (2016). Conditional probability and improper priors. Communications in Statistics - Theory and Methods, (to appear).

Department of Mathematics, University of Oslo, Norway

I introduce a method for melding together the classic parametric
likelihood with the nonparametric empirical likelihood, and work out
theory for such schemes. The method is really a class of methods, as
the statistician would need to choose both which extra parameters to
include in the construction and a certain balance parameter that
weighs parametrics against nonparametrics. I will show how the
methodology of focused information criteria may be used to aid in
these choices.

Organizer: Sara Sjöstedt de Luna

James O. Ramsay.
Exploring Functional Data with Dynamic Smoothing

Department of Psychology, McGill University, Montreal, Canada

Discrete observations of curves are often smoothed by
attaching a penalty to the error sum of squares, and the most popular
penalty is the integrated squared second derivative of the function
that fits the data. But it has been known since the earliest days of
smoothing splines that, if the linear differential operator \(D^2\) is
replaced by a more general differential operator \(L\) that annihilates
most of the variation in the observed curves, then the resulting
smooth has less bias and greatly reduced mean squared error.

This talk will show how we can use the data to estimate such a linear
differential operator for a system of one or more variables. The
differential equations estimated in this way represent the dynamics of
the processes being estimated. This idea can be used to estimate a
forcing function that defines the output of a linear system; applying
it to handwriting data shows that both the static and dynamic aspects
of handwriting are well represented by a surprisingly simple
second-order differential equation.
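
The point can be made concrete with a discretized toy version (my own
sketch, not the talk's implementation): for noisy observations of
\(\sin(3t)\), penalizing with \(L = D^2 + 9I\), which annihilates the
signal, beats the default \(D^2\) penalty at the same smoothing level.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.linspace(0, 2*np.pi, n)
y = np.sin(3*t) + 0.3*rng.normal(size=n)

def smooth(y, L, lam):
    """Minimize ||y - f||^2 + lam ||L f||^2 over f (ridge form of spline smoothing)."""
    return np.linalg.solve(np.eye(len(y)) + lam*(L.T @ L), y)

h = t[1] - t[0]
D2 = np.diff(np.eye(n), 2, axis=0) / h**2   # discrete second-derivative operator
L_adapted = D2 + 9.0*np.eye(n)[1:-1]        # L f = f'' + 9 f, so L sin(3t) = 0

rmse = {}
for name, L in [("D^2", D2), ("D^2 + 9I", L_adapted)]:
    f = smooth(y, L, lam=1.0)
    rmse[name] = float(np.sqrt(np.mean((f - np.sin(3*t))**2)))
    print(name, "RMSE:", round(rmse[name], 3))
```

With heavy smoothing the \(D^2\) fit is biased towards a straight
line, while the adapted operator keeps the oscillation in its null
space and removes mostly noise.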

Piercesare Secchi.
Random domain decompositions for Object Oriented Spatial Statistics

MOX, Department of Mathematics, Politecnico di Milano, Italy

Object Oriented Spatial Statistics (O2S2) addresses a variety
of application-oriented statistical challenges where the atoms of the
analysis are complex data points spatially distributed. The object
oriented viewpoint consists in considering as building block of the
analysis the whole data point, whether it is a curve, a distribution
or a positive definite matrix, regardless of its complexity. When data
are observed over a spatial domain, an extra layer of complexity
derives from the size, the shape or the texture of the domain, posing
a challenge related to the impossibility, both theoretical and
practical, of employing approaches based on global models for
capturing spatial dependence. A powerful non-parametric line of action
is obtained by splitting the analysis along an arrangement of
neighborhoods generated by a random decomposition of the spatial
domain. The local analyses produce auxiliary new data points which are
then aggregated to generate the global final result. I will illustrate
these ideas with a few examples where the target analysis is
dimensional reduction, classification or prediction.

The talk is based on discussions and work developed at MOX, Department
of Mathematics, Politecnico di Milano, with Alessandra Menafoglio,
Simone Vantini and Valeria Vitelli (the latter is now at the Oslo
Center for Biostatistics and Epidemiology, Department of
Biostatistics, University of Oslo) and with Konrad Abramowicz, Per
Arnqvist and Sara Sjöstedt de Luna at the Department of
Mathematics and Mathematical Statistics, Umeå University.

**References**

Abramowicz K., Arnqvist P., Secchi P., Sjöstedt de Luna S., Vantini
S., Vitelli V. Clustering misaligned dependent curves -
applied to varved lake sediment for climate reconstruction,
*Manuscript*, 2015.

Menafoglio A. and Secchi P. Statistical analysis of complex and spatially dependent
data: a review of Object Oriented Spatial Statistics, *Manuscript*, 2016.

Secchi, P., Vantini, S., and Vitelli, V. Bagging Voronoi classifiers
for clustering spatial functional data. *International Journal
of Applied Earth Observation and Geoinformation*, **22**, 53-64, 2012.

Secchi, P., Vantini, S., and Vitelli, V. Analysis of
spatio-temporal mobile phone data: a case study in the metropolitan
area of Milan (with discussion). *Statistical Methods and
Applications*, **24**(2), 279-300, 2015.

Organizer: Carsten Wiuf

School of Mathematics & Statistics, University of Newcastle, United Kingdom

Performing inference for the parameters governing the Markov
jump process (MJP) representation of a stochastic kinetic model, using
data that may be incomplete and subject to measurement error, is a
challenging problem. Since transition probabilities are intractable
for most processes of interest yet forward simulation is
straightforward, fully Bayesian inference typically proceeds through a
"likelihood-free" particle MCMC (pMCMC) scheme. In this talk, we
describe a recently proposed approach that exploits the tractability
of an approximation to the MJP to reduce the computational cost of the
algorithm, whilst still targeting the correct posterior. We also
demonstrate that it is possible to improve the statistical efficiency
of the vanilla implementation by replacing draws from the forward
simulator with those obtained from an approximation to the MJP,
conditioned on the observations. We illustrate each approach using toy
models of gene expression and predator-prey dynamics.
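
A toy sketch of the likelihood-free ingredient (an invented
immigration-death model with Poisson observation error, not one of the
talk's examples): forward simulation by the Gillespie algorithm,
wrapped in a bootstrap particle filter whose output is the unbiased
likelihood estimate that pMCMC plugs into the acceptance ratio.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

def gillespie(x, t_end, kappa, delta):
    """Simulate an immigration-death MJP (birth rate kappa, death rate delta*x)."""
    t = 0.0
    while True:
        total = kappa + delta*x
        t += rng.exponential(1.0/total)
        if t > t_end:
            return x
        x += 1 if rng.uniform() < kappa/total else -1

def log_lik_hat(y, kappa, delta, n_part=200):
    """Bootstrap particle filter for counts y observed with Poisson noise."""
    x = np.full(n_part, 10)          # initial state assumed known
    ll = 0.0
    for obs in y:
        x = np.array([gillespie(xi, 1.0, kappa, delta) for xi in x])  # propagate
        w = poisson.pmf(obs, np.maximum(x, 1e-12))                    # weight
        ll += np.log(w.mean())
        x = x[rng.choice(n_part, n_part, p=w/w.sum())]                # resample
    return ll

y = [9, 12, 10, 8, 11]               # invented observations
print("log-likelihood estimate:", round(log_lik_hat(y, kappa=10.0, delta=1.0), 2))
```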

Sach Mukherjee.
Causal discovery in molecular biology

German Center for Neurodegenerative Diseases, Bonn, Germany

Networks play a central conceptual role in molecular
biology. Molecular networks are often conceived of as encoding causal
influences. Then, the task of estimating such networks from data is
effectively one of causal discovery. In recent years there has been
much innovative methodological work in this area. However, our
understanding of empirical performance remains limited in important
ways, and it is unclear whether estimation of causal molecular
networks is really effective in practice, particularly in relatively
complex biomedical settings. I will discuss some aspects of estimation
of causal networks in molecular biology, with particular emphasis on
the empirical assessment of candidate estimators.

Organizer: Mogens Bladt

Asger Hobolth.
Inferring population history from DNA sequences: Statistical
methods, models and challenges.

Center for Bioinformatics, Aarhus University, Denmark

In this talk I describe statistical methods and recent
results for inferring variability in population size in humans and
other species. Variability in population size has traditionally been
inferred from the site frequency spectrum, but in the past few years
methods based on complete genome sequences have been developed. These
new methods are based on a Markov approximation along the sequences of
the ancestral process with mutation and recombination. The
approximation means that the ancestral process simplifies to a state
space model, and we use particle filtering to carry out statistical
inference. I also describe how to perform model checking, and draw
connections between various mutation pattern summaries and methods
from spatial statistics.

Michael Sørensen.
Bridge Simulation for Multivariate Stochastic Differential Equations

Department of Mathematical Sciences, University of Copenhagen, Denmark

New simple methods of simulating multivariate diffusion bridges are
presented. Diffusion bridge simulation plays a fundamental role in
simulation-based likelihood inference for stochastic differential
equations. By a novel application of classical coupling methods, the
new approach generalizes the one-dimensional bridge-simulation method
proposed by Bladt and Sørensen (2014) to the multivariate setting. A
method of simulating approximate, but often very accurate, diffusion
bridges is proposed. These approximate bridges are used as proposal
for easily implementable MCMC algorithms that produce exact diffusion
bridges. The new method is more generally applicable than previous
methods because it does not require the existence of a Lamperti
transformation, which rarely exists for multivariate
diffusions. Another advantage is that the new method works well for
diffusion bridges in long intervals because the computational
complexity of the method is linear in the length of the interval. The
method is presented in Bladt, Finch and Sørensen (2016).

The lecture is based on joint work with M. Bladt and S. Finch.
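
A hedged one-dimensional sketch of the coupling idea behind the method
(for a stationary Ornstein-Uhlenbeck process, whose time reversal has
the same dynamics, with an Euler scheme and invented endpoints): run
one path forward from \(a\), an independent path backward from \(b\),
and splice them at the first crossing. The result is an approximate
bridge of the kind used as an MCMC proposal in Bladt and Sørensen
(2014).

```python
import numpy as np

rng = np.random.default_rng(3)

def bridge_attempt(a, b, T=1.0, n=200):
    """One attempt at an approximate OU bridge dX = -X dt + dW from (0,a) to (T,b)."""
    dt = T / n
    fwd = np.empty(n + 1); fwd[0] = a
    for i in range(n):                    # Euler scheme, forward in time from a
        fwd[i+1] = fwd[i] - fwd[i]*dt + np.sqrt(dt)*rng.normal()
    bwd = np.empty(n + 1); bwd[n] = b
    for i in range(n, 0, -1):             # independent path, backward in time from b
        bwd[i-1] = bwd[i] - bwd[i]*dt + np.sqrt(dt)*rng.normal()
    cross = np.nonzero(np.diff(np.sign(fwd - bwd)))[0]
    if cross.size == 0:                   # no crossing: reject and try again
        return None
    k = cross[0]                          # splice the paths at the first crossing
    return np.concatenate([fwd[:k+1], bwd[k+1:]])

path = None
while path is None:
    path = bridge_attempt(-0.5, 0.5)
print(len(path), path[0], path[-1])       # hits both endpoints exactly
```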

**References**

Bladt, M. and Sørensen, M. (2014). Simple simulation of diffusion bridges with application to likelihood inference for diffusions. *Bernoulli*, 20, 645-675.

Bladt, M., Finch, S. and Sørensen, M. (2016). Simulation of multivariate diffusion bridges. *J. Roy. Statist. Soc. B*, 78, 343-369.

Mogens Bladt.
Phase-type distributions and heavy tails

Department of Mathematical Sciences, University of
Copenhagen, Denmark

In this talk we propose a class of genuinely heavy-tailed
distributions which are mathematically tractable in the sense that we
can obtain closed-form formulas and/or exact solutions in
applications.
The class, *NPH*, is based on infinite-dimensional phase-type
distributions with finitely many parameters. Though the class of
finite-dimensional phase-type distributions is dense in the class of
distributions on the positive reals, and may hence approximate any
such distribution, a finite-dimensional approximation will always be
light-tailed. This may be a problem when the functionals of interest
are tail dependent, such as e.g. a ruin probability.

A characteristic feature of distributions from *NPH* is that the
formulas from finite-dimensional phase-type theory remain valid even
in the infinite-dimensional setting. The numerical evaluation of the
infinite-dimensional formulas, however, differs from the
finite-dimensional theory, and we shall provide algorithms for the
numerical calculation of certain functionals of interest, such as e.g.
the renewal density and a ruin probability.

We present an example from risk theory where we compare ruin
probabilities for a classical risk process with Pareto distributed
claim sizes to the ones obtained by approximating the Pareto
distribution by an infinite-dimensional hyper-exponential
distribution.
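
The mixture construction behind such approximations can be sketched
compactly (a generic illustration, not the paper's algorithm): a
Pareto (Lomax) variable is an infinite mixture of exponentials, its
rate randomized by a gamma distribution, so the tail is genuinely
heavy although every conditional component is light-tailed.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.5, 1.0   # Pareto (Lomax) tail index and scale

# Draw a random rate Lambda ~ Gamma(alpha, rate=beta), then X | Lambda ~ Exp(Lambda);
# marginally X is Lomax with survival function (beta/(beta+x))^alpha.
n = 1_000_000
lam = rng.gamma(alpha, 1.0/beta, n)      # numpy parameterizes by (shape, scale)
x = rng.exponential(1.0/lam)

# compare the empirical tail with the exact Lomax survival function
for t in (1.0, 5.0, 20.0):
    emp = (x > t).mean()
    exact = (beta/(beta + t))**alpha
    print(f"t={t:>4}: empirical {emp:.4f}  exact {exact:.4f}")
```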

Holger Rootzén

Mathematical Sciences, Chalmers and Gothenburg University, Sweden

Extreme value statistics: from one dimension to many
Extreme value statistics helps protect us from devastating
waves, floods, windstorms, and landslides. It is widely used for risk
management in finance and insurance, and contributes to material
science, bioinformatics, medicine, and traffic safety.

The first talk
introduces the well-established and widely used statistical theory for
extremes of one-dimensional variables. Topics include the block maxima
and peaks over thresholds methods; asymptotic (tail) independence;
threshold choice; maximum likelihood methods; and model
diagnostics. The theory is illustrated by examples from climate change
statistics, insurance, metal fatigue, gene estimation, and driver
inattention.
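
The peaks-over-thresholds recipe can be sketched in a few lines (a toy
fit to simulated heavy-tailed data, not one of the applications
mentioned in the talk):

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
data = rng.standard_t(df=4, size=20_000)   # heavy-tailed toy "losses"

u = np.quantile(data, 0.95)                # threshold choice: 95% quantile
exc = data[data > u] - u                   # peaks over the threshold

# fit the generalized Pareto distribution to the exceedances
xi, _, sigma = genpareto.fit(exc, floc=0.0)
print(f"shape xi = {xi:.2f}, scale sigma = {sigma:.2f}")

# tail estimate: P(X > x) ~= p_u * (1 + xi*(x-u)/sigma)^(-1/xi)
p_u = (data > u).mean()
x0 = np.quantile(data, 0.999)
tail_model = p_u * genpareto.sf(x0 - u, xi, 0.0, sigma)
tail_emp = (data > x0).mean()
print("model tail:", tail_model, " empirical:", tail_emp)
```

The fitted shape \(\hat\xi\) should be close to the tail index
\(1/\mathrm{df}\) of the simulated \(t\)-distribution.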

In the second talk I survey some of the intensive
research in multivariate extreme value statistics which happens right
now. Multivariate block maxima methods have so far seen the most
development, and the methods have already been directed at important
societal problems from hydrology. However, in more than one dimension,
block maxima hide information of whether extremes occur at the same
time or not, and likelihoods often become unwieldy in dimensions
higher than 3 or 4. Instead peaks over threshold methods keep track of
whether extremes occur at the same time or not. The last part of the
talk surveys work in progress on new parametric multivariate
generalized Pareto models. These models, perhaps surprisingly, have
simple and tractable likelihoods, and permit use of the entire
standard maximum likelihood machinery for estimation, testing, and
model checking. I will show how the models can contribute to wind
storm insurance, financial portfolio selection, and landslide risk
assessment. Throughout, an important issue is how estimated risk
should be presented and understood.

Organizer: Jimmy Olsson

Randal Douc.
Folding Markov chains

Département CITI, Telecom SudParis, France

In this paper, we consider the implications of "folding" a
Markov chain Monte Carlo algorithm, namely to restrict simulation of a
given target distribution to a convex subset of the support of the
target via a projection on this subset. We argue that this
modification should be implemented in every case where an MCMC
algorithm is considered. In particular, we demonstrate improvements in
the acceptance rate and in the ergodicity of the resulting Markov
chain. We illustrate those improvements on several examples, but
insist on the fact that they are universally applicable at a
negligible computing cost, independently of the dimension of the
problem.

Joint work with Christian Robert (Université Paris-Dauphine).
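
A minimal sketch of the folding idea (my own toy example, not from the
paper): fold a standard normal target onto the convex set
\([0,\infty)\) by projecting proposals with \(x \mapsto |x|\); the
folded target is the half-normal, and the projected random-walk
proposal remains symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)

# folded target: half-normal on [0, inf), log-density -y^2/2 up to a constant
def log_folded(y):
    return -0.5*y*y if y >= 0 else -np.inf

def rwm_folded(n_iter=50_000, step=1.0):
    y, out, acc = 1.0, np.empty(n_iter), 0
    for i in range(n_iter):
        prop = abs(y + step*rng.normal())   # propose, then project onto [0, inf)
        # |.| maps x and -x to the same point, so the proposal stays symmetric
        if np.log(rng.uniform()) < log_folded(prop) - log_folded(y):
            y, acc = prop, acc + 1
        out[i] = y
    return out, acc / n_iter

chain, rate = rwm_folded()
print(f"acceptance {rate:.2f}, mean {chain.mean():.3f}")  # sqrt(2/pi) ~ 0.798
```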

Nick Whiteley.
Fluctuations, stability and instability of a distributed particle filter with local exchange

School of Mathematics, Bristol, United Kingdom

We study a distributed particle filter proposed by Bolic et
al. (IEEE Trans. Sig. Proc. 2005). This algorithm involves \(m\) groups of
\(M\) particles, with interaction between groups occurring through a
"local exchange" mechanism. We establish a central limit theorem in
the regime where \(M\) is fixed and \(m\) grows. A formula we obtain for the
asymptotic variance can be interpreted in terms of colliding Markov
chains, enabling analytic and numerical evaluations of how the
asymptotic variance behaves over time, with comparison to a benchmark
algorithm consisting of \(m\) independent particle filters. Subject to
regularity conditions, when \(m\) is fixed both algorithms converge
time-uniformly at rate \(M^{-1/2}\). Through use of our asymptotic
variance formula we give counter-examples satisfying the same
regularity conditions to show that when \(M\) is fixed neither
algorithm, in general, converges time-uniformly at rate \(m^{-1/2}\).

This is joint work with Kari Heine (UCL).

**References**

http://arxiv.org/abs/1505.02390

Department of mathematics, KTH, Stockholm, Sweden

We shall discuss a sequential Monte Carlo-based approach to
approximation of probability distributions defined on spaces of
decomposable graphs, or, more generally, spaces of junction (clique)
trees associated with such graphs. In particular, we apply a particle
Gibbs version of the algorithm to Bayesian structure learning in
decomposable graphical models, where the target distribution is a
junction tree posterior distribution. Moreover, we use the proposed
algorithm for exploring certain fundamental properties of decomposable
graphs, e.g., clique size distributions. Our approach requires the
design of a family of proposal kernels, so-called junction tree
expanders, expanding a given junction tree by randomly connecting a
new node to the underlying graph. The performance of the estimators is
illustrated through a collection of numerical examples demonstrating
the feasibility of the suggested approach in high-dimensional domains.

This is joint work with Tatjana Pavlenko and Felix Rios (KTH).

Organizer: Ziad Taib

Department of Statistics, Stockholm University, Sweden

The sample size of a clinical trial is often determined
based on power to show a statistically significant effect versus
control. The conventional significance level of 5% is usually used. In
this traditional approach, neither the sample size nor the rule for
rejecting the no-effect null hypothesis depends on the size of the
population having the disease.

We consider a trial in a target population with a rare disease, where
not enough patients exist to conduct a trial of traditional size. We
discuss how we can instead justify the sample size for such a
population based on a decision-theoretic approach, where the sample
size depends on the population size. Our method is applied to real
disease cases. We then discuss potential justifications for
significance levels.

The talk is based on joint work with Simon Day, Siew Wan Hee, Jason
Madan, Martin Posch, Nigel Stallard, Mårten Vågerö and Sarah Zohar for
the InSPiRe project, and joint work with Carl Fredrik Burman for the
IDEAL project. These projects have
received funding from the European Union's Seventh Framework Programme
for research, technological development and demonstration under grant
agreement no 602144 and no 602552.

AstraZeneca, Sweden

Mid-study design modifications are becoming increasingly
accepted in confirmatory clinical trials, so long as appropriate
methods are applied such that error rates are controlled. It is
therefore unfortunate that the important case of time-to-event
endpoints is not easily handled by the standard theory. We analyze
current methods that allow design modifications to be based on the
full interim data, i.e., not only the observed event times but also
secondary endpoint and safety data from patients who are yet to have
an event. We show that the final test statistic may ignore a
substantial subset of the observed event times, and that this leads to
inefficiency compared to alternative sample size re-estimation
strategies.

Ziad Taib and Mahdi Hashemi.
Interim Analyses: the more the merrier?

Early Clinical Biometrics, Astrazeneca, Sweden

Continuous administrative interim looks at data during the
course of a clinical trial are sometimes suggested as a tool for
decision making. Sometimes it is even suggested that being of an
administrative nature, there is no need for statistical formalism
since there is no intention of modifying the ongoing trial. In other
circumstances, interim analyses are proposed as part of futility
analyses in clinical trials with group sequential designs. We discuss
such interim analyses and argue that the need for rigorous statistical
methods is no different in administrative analyses than in other types
of interim analyses. Moreover, we argue that, quite often, increasing
the number of interim looks leads to higher risk and increased cost
compared to e.g. one single interim analysis.

This presentation is based on joint work with Mahdi Hashemi and Magnus
Kjaer from Astrazeneca RD, Gothenburg, Sweden.

David Lando

Center for Financial Frictions, Copenhagen Business School, Denmark

Models in bank regulation
The core of banking regulation seeks to limit the risk that banks default. Setting limits involves quantifying the risks that banks take when they give loans, hold securities and trade in derivatives, and when they choose how to fund their activities. The models used can be extremely simple or highly complex. Using some basic examples I will highlight the important trade-offs that regulators face between simplicity and transparency on one side and realism and flexibility on the other. I will also discuss how regulatory rules may affect market prices and have profound effects on bank behavior, sometimes creating perverse incentives.