Workshop on Statistical Network Science

07 Jun 2018, 10:00 - 17:00

Darwin Centre, Hamilton Centre, Brunel University London

To register for this free workshop please click here and enter the Promotional code sns when required.

10.00 Registration and Coffee

10.10 Welcome

SESSION 1

10.15 Ernst Wit (Groningen University)
Title: How scale-free are real world networks?

11.00 Sofia Massa (Oxford University)
Title: Learning stable graphical models

11.45 Elena Stanghellini (Perugia University)
Title: On Path Analysis for Bayesian Networks with Binary Nodes

12.30 Lunch

SESSION 2

13.30 Michael Stumpf (Imperial College London)
Title: From Single Cell Data to Mechanistic Insights Into Stem Cell Differentiation

14.15 Lorenz Wernisch (Biostatistics Unit, MRC)
Title: Improving protocols for cell differentiation through reinforcement learning

15.00 Coffee Break

SESSION 3

15.30 Vasiliki Koutra (Southampton University)
Title: Design of experiments and network symmetry

16.15 Tiago Peixoto (Bath University)
Title: Reconstructing networks with heterogeneous and unknown errors

17.00 Concluding remarks

ABSTRACTS

Ernst Wit, How scale-free are real world networks?

For the past 20 years, the concept of the scale-free network has dominated as a powerful unifying paradigm in the study of complex systems in biology and in physical and social studies. Metabolic, protein, and gene interaction networks have been reported to exhibit scale-free behavior based on the analysis of the distribution of the number of connections of the network nodes. In 2006 Khanin and Wit suggested that in reality the associated power-law is not always empirically supported. Recently, Broido and Clauset (2018) argued something similar, which earned them scorn from Barabasi (2018). In this talk, we first dig into the “scale-free distribution” and discover that there are important low-degree and high-degree deviations from the power-law. We then re-examine a large database of empirical degree sequences, fitting these with a wider class of distributions consistent with scale-free network structure and controlling for some subtler statistical issues not addressed in previous analyses. Using appropriate tests for deviations from “scale-free” and careful interpretation of the results, we give a preliminary answer to the question in the title, hopefully settling the Broido/Clauset-Barabasi argument.

Sofia Massa, Learning stable graphical models

Ongoing advances in model selection techniques for graphical models are trying to capture the structure of complex, high-dimensional datasets. Sparsity is usually invoked and techniques based on regularization, cross-validation, resampling and shrinkage estimation are becoming quite standard. One practical challenge in many applied contexts is how to assess the stability of different dependency structures and how to report the uncertainty associated with them. In this talk, we will look at possible stability and uncertainty measures for undirected and chain graph models.

Elena Stanghellini, On Path Analysis for Bayesian Networks with Binary Nodes

We investigate the relationship between the parameters of marginal and conditional logistic regression models with binary mediators when no conditional independence assumptions can be made. The interest of this study lies in several research questions. Given a data-generating process, a researcher may wish to quantify how much of the total effect of a covariate on a response is due to intermediate variables and can be removed after conditioning on their values. From a different, though related, point of view, one may wish to quantify the distortion of some regression coefficients of interest due to the omission of relevant unmeasured covariates, and use this information to build reasonable bounds or to conduct sensitivity analysis. We show how the marginal parameters decompose into a sum of terms that vanish whenever the parameters of the conditional model vanish, in parallel with Cochran's formula for the linear case. We further show how these results can be used to extend path analysis to recursive systems of binary random variables.

Joint work with Marco Doretti

Michael Stumpf, From Single Cell Data to Mechanistic Insights Into Stem Cell Differentiation

Gene regulatory networks regulate cellular activities, but the structure of these networks is still poorly understood and must be inferred from data. Improvements in experimental technology provide us with data that are information-rich but more complex to analyze, raising a need for improved statistical techniques. One such method is multivariate information theory, which has proven itself well in recovering information from single cell data. However, information values do not indicate statistical significance, making it hard to predict networks with a given accuracy. We propose a framework to combine multivariate information with empirical Bayes to perform hypothesis testing on networks and allow network accuracy to be controlled. Furthermore, our approach allows us to incorporate prior information to improve predictions.

Lorenz Wernisch, Improving protocols for cell differentiation through reinforcement learning

Great progress has been made in generating mature cells and tissues from human pluripotent stem cells (hPSCs) for clinical applications, for example, in producing platelets for blood transfusion from megakaryocytes (MKs). Forward programming, the induction of specific transcription factors related to cell development such as GATA1, has been successfully applied to produce MKs from hPSCs. However, they retain undesirable characteristics of stem cells and their ability to produce platelets is limited. In preparation for experiments optimising the experimental protocol we explore how far reinforcement learning can guide such experiments in real time. Over a time course of several days cell markers indicating the maturity of cells can be monitored and the protocol adapted. In simulations we test reinforcement algorithms based on Gaussian process state space models and current knowledge of relevant gene regulatory networks for their ability to suggest suitable experimental interventions.

Joint work with John Reid, Cedric Ghevaert and Thomas Moreau

Vasiliki Koutra, Design of experiments and network symmetry

Designing experiments on networks challenges the traditional design approaches and classical assumptions, due to the interference among the interconnected experimental units as well as the design size. We suggest a novel algorithmic approach for obtaining efficient designs on networks within a practical time frame, by utilising the network topology and particularly its symmetries. We show that the decomposition of the graph based on its symmetries can substantially reduce the search time while maintaining the design efficiency at a sufficient level. This technique can be regarded as an essential step in the search for an optimal design on experimental units that are connected in a large network. We discuss several synthetic and real-world examples.

Tiago Peixoto, Reconstructing networks with heterogeneous and unknown errors

The vast majority of network datasets contains errors and omissions, although this is rarely incorporated in traditional network analysis. Recently, an increasing effort has been made to fill this methodological gap by developing network reconstruction approaches based on Bayesian inference. These approaches, however, rely on assumptions of uniform error rates and on direct estimations of the existence of each edge via repeated measurements, something that is currently unavailable for the majority of network data. Here we develop a Bayesian reconstruction approach that lifts these limitations by not only allowing for heterogeneous errors, but also for individual edge measurements without direct error estimates. Our approach works by coupling the inference approach with structured generative network models, which enable the correlations between edges to be used as reliable error estimates. Although our approach is general, we focus on the stochastic block model as the basic generative process, from which efficient nonparametric inference can be performed, and which yields a principled method to infer hierarchical community structure from noisy data. We demonstrate the efficacy of our approach in a variety of empirical and artificial networks.