Student research opportunities
Data Clouds for Large-scale Scientific Data Processing
Project Code: CECS_651
This project is available at the following levels:
CS single semester, Honours, Masters, PhD
Keywords:
Data clouds, cloud computing, scientific computing.
Supervisor:
Dr Peter Strazdins
Outline:
In 2011, the National Computational Infrastructure will host Australia's IPCC climate data (e.g. 2 PB of NetCDF files) and water resources data. Typical usage scenarios include climate model re-analysis and evaluation, where data requests are sent to a special cloud hosting the data. Issues include differing co-ordinate systems and time-skewing between data sets, the creation of derived data, and long delays for some data requests.
Goals of this project
This project will develop infrastructure to support data clouds in the context of large-scale scientific data processing. This might include developing workflows, integrating data requests with Hadoop, devising appropriate scheduling, and enabling (potentially) asynchronous transactions. It will also evaluate the applicability of the cloud model to large-scale climate data and simulations.
For the Honours and Masters level, particular aspects of this work will be selected.
Student Gain
This is part of a collaboration between ANU, NCI/NF and the Bureau of Meteorology. It will provide supportive infrastructure that will be used by climate and environmental scientists.
Background Literature
K. R. Jackson, L. Ramakrishnan, K. J. Runge, and R. C. Thomas, Seeking supernovae in the clouds: a performance study, in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, ser. HPDC '10. New York, NY, USA: ACM, 2010, pp. 421-429.
Thakar and A. Szalay, Migrating a large science database to the cloud, in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, ser. HPDC '10.
Jun Wang et al., Using Service-Based GIS to Support Earthquake Research and Disaster Response, Computing in Science & Engineering, Sept.-Oct. 2012.
Links
OpenDAP
CMIP5 framework
Climate Data Challenges in the 21st Century



