Estimating time-varying networks
Mladen Kolar Machine Learning Department Carnegie Mellon University
[email protected]
Abstract Networks are useful for representing relationships between populations of individuals, e.g., in social media networks represent connections between different actors and type of interaction that two actors have, in systems biology networks can represent complex regulatory circuitry that control cell behavior. Networks help us answer some of the fundamental questions about the system under consideration, such as: 1) Function identification, e.g., what role(s) do individuals play when they interact with different peers? 2) System robustness, e.g., how does the regulatory network rewire as a response to external stimuli? 3) Forecasting, e.g., based on the current activity, can we predict alternations of social structure (e.g., emerging or dissolving of subpopulations). Unfortunately, the network is often not directly observable, so to answer the above questions it needs to be estimated. It is not unusual for data to be large, dynamic, heterogeneous and noisy. Each of these characteristics adds a degree of complexity to the network estimation. Many of the existing approaches ignore the dynamic aspect of networks and use a simplistic approach to estimation, which provides limited insight into the processes underlying network changes. For example, it is common to infer a static regulatory network from microarray data collected over a period of time and under different conditions. I will present a line of work that deals with estimation of high-dimensional dynamic networks from limited amounts of data. The main focus is on developing semiparametric models that are very flexible in capturing dynamics of network changes and, at the same time, that are interpretable as parametric models. The models can be efficiently estimated by solving a sequence of convex optimization programs and are easily scalable to networks of tens of thousands nodes. This is particularly important for estimating regulatory networks on a whole genome scale. It can be rigorously shown under which conditions the estimated network recovers the unknown true network. So far, we have developed algorithms that recover smoothly varying networks and networks that have sudden changes in topology, e.g., as a response to external stimuli. Some promising results have been found on several real datasets. We reverse engineer the latent sequence of temporally rewiring political networks between Senators from the US Senate voting records. We estimate the latent evolving regulatory networks underlying 588 genes across the life cycle of Drosophila melanogaster from microarray time course.
References [1] M. Kolar, L. Song, E. P. Xing. Sparsistent Learning of Varying-coefficient Models with Structural Changes. Advances in Neural Information Processing Systems 23, 2009. [2] M. Kolar and E. P. Xing. Sparsistent Estimation of Time-Varying Discrete Markov Random Fields. Submitted to Annals of Statistics, June 2009. [3] L. Song, M. Kolar, and E. P. Xing. KELLER: Estimating time-evolving interactions between genes. The Sixteenth International Conference on Intelligence Systems for Molecular Biology (ISMB 2009). Bioinformatics 2009 25(12):i128-i136. [4] M. Kolar, L. Song, A. Ahmed, and E. P. Xing. Estimating time-varying networks. Submitted to Annals of Applied Statistics, 2008. 1