Student research opportunities
Modelling very large spatio-temporal environmental data
CSIRO PhD top-up of $15,000 per year available on application
Project Code: CECS_845
This project is available at the following levels:
PhD
Please note that this project is only for higher degree (postgraduate) applicants.
Keywords:
Gaussian processes, spatio-temporal model, MCMC
Supervisor:
Dr Warren Jin
Outline:
Owing to advances in technology, massive data sets have been collected at many spatial locations over long time periods in the environmental and geophysical sciences. For example, every day in Australia, weather data are accumulated from 3,800+ stations and hundreds of thousands of soil moisture observations are collected. Gaussian processes are well understood and widely used by the statistical and scientific communities because of their stability and their relative computational and theoretical tractability (see e.g., Cressie and Wikle 2011, Rasmussen and Williams 2006). However, for a wide range of environmental data, such as daily precipitation, pollutant concentrations, pollen, and soil moisture, Gaussian spatio-temporal models cannot reasonably be fitted to the observations.
This project will develop non-Gaussian spatio-temporal models for large data sets that may be zero-inflated, skewed, and/or long-tailed. One direction is to transform a Gaussian process so that it fits the observations, potentially using link functions like those in generalised linear regression. Care must be taken in the spatial prediction and/or temporal projection step, because the covariance function in the transformed space differs from, and is in fact biased relative to, the one in the original space. Existing computationally efficient approximation techniques for large spatio-temporal data sets, such as fixed rank filtering or Gaussian predictive processes, may need adjustment. In addition, large-scale spatial/temporal dependency should be balanced with fine-scale dependency.
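As a rough illustration of why the prediction step needs care, the sketch below assumes a log link (an assumption made here for illustration; the project does not fix a particular transformation). If Z is Gaussian with mean mu and variance sigma^2, then exp(Z) has mean exp(mu + sigma^2/2), so simply exponentiating the Gaussian prediction underestimates the mean, and the covariance of the transformed field is not the exponential of the Gaussian covariance. The function names and parameter values are hypothetical.

```python
import math
import random


def simulate_back_transform(mu=0.5, sigma=1.0, n=200_000, seed=0):
    """Compare a Monte Carlo mean of exp(Z), Z ~ N(mu, sigma^2),
    with the naive and the bias-corrected back-transforms."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += math.exp(rng.gauss(mu, sigma))
    mc_mean = total / n                          # empirical mean on the data scale
    naive = math.exp(mu)                         # exponentiated Gaussian mean (biased low)
    corrected = math.exp(mu + 0.5 * sigma ** 2)  # exact lognormal mean
    return mc_mean, naive, corrected


def lognormal_cov(mu, var, cov):
    """Covariance of exp(Z1) and exp(Z2) for a stationary Gaussian pair
    with common mean mu, common variance var, and covariance cov."""
    return math.exp(2.0 * mu + var) * (math.exp(cov) - 1.0)


mc_mean, naive, corrected = simulate_back_transform()
print(f"Monte Carlo mean: {mc_mean:.3f}")
print(f"naive exp(mu):    {naive:.3f}")
print(f"corrected:        {corrected:.3f}")

# Covariance on the data scale for an (assumed) exponential Gaussian
# covariance exp(-h / 2) evaluated at a few distances h.
for h in (0.0, 1.0, 5.0):
    gaussian_cov = math.exp(-h / 2.0)
    print(h, round(gaussian_cov, 3), round(lognormal_cov(0.5, 1.0, gaussian_cov), 3))
```

Other link functions would need their own correction, which in general has no closed form and may have to be approximated numerically.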
Another direction is to assume that the spatio-temporal environmental data follow specific non-Gaussian processes, such as t-processes or Gamma processes. Challenges here include the theoretical development of the models, the choice of appropriate covariance functions, identifiability, and efficient computation.
Goals of this project
The project will develop sophisticated non-Gaussian spatio-temporal models, and possibly associated software as well. The techniques developed are applicable to various important environmental problems, such as daily precipitation projection, climate change attribution, and the fusion of remotely sensed soil moisture data, to name just a few. The project will also have an impact on these areas by combining sophisticated statistical modelling techniques with modern computational techniques.
Requirements/Prerequisites
- A major in statistics, mathematics, or machine learning
- Strong interest in environmental problems
- Preferably a strong background in statistical modelling or statistical computation
- Preferably excellent programming skills (R, MATLAB or C/C++)
Student Gain
A student working on this project can expect
- to learn state-of-the-art spatio-temporal modelling techniques;
- to be involved in developing cutting-edge techniques to handle real-world environmental challenges while working with a research group delivering great science and innovative solutions for Australian society and the economy;
- to be eligible for a supplementary PhD scholarship from CSIRO of $15,000 per year for three years, subject to a separate application to CSIRO.
Background Literature
- Cressie, N., T. Shi, and E. L. Kang (2010), Fixed Rank Filtering for Spatio-Temporal Data, Journal of Computational and Graphical Statistics, 19(3), 724-745, DOI 10.1198/jcgs.2010.09051.
- Rasmussen, C. E., and C. K. I. Williams (2006), Gaussian Processes for Machine Learning, MIT Press, ISBN-10 0-262-18253-X.
- Porcu et al. (eds.) (2012), Advances and Challenges in Space-time Modelling of Natural Events, Springer.
Links
Co-supervisor: Dr. Phil Kokic
Co-supervisor: Prof. Alen Welsh

