Metallic nanoparticles are used as catalysts in a range of energy applications, and are the focus of intense research to lower the cost of energy production. The structure of the individual nanoparticles (only millionths of a millimetre in size) determines their efficiency, and manufacturers are keen to understand which aspects of the nanoparticle structure could be used to target different applications. There are many ways to describe the structure of a nanoparticle, which makes the development of predictive machine learning models challenging. While some work has been done to assess the suitability of different descriptors for regression, little has been done to understand how unsupervised clustering and supervised classification depend on the choice of descriptors, which creates uncertainty. In this project you will explore the use of clustering methods to separate different types of palladium nanoparticles using a set of different descriptors containing more or less comprehensive information. You will then develop a classifier that can assign unseen structures, based on the different descriptors and the labels generated by the chosen clustering algorithm, and that can be used to guide the development and manufacturing of these materials. The data sets will be provided.
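To make the cluster-then-classify workflow concrete, the following is a minimal sketch only, not the project's prescribed method. It assumes scikit-learn is available and uses synthetic data from `make_blobs` as a stand-in for the provided palladium data sets. The two descriptor sets ("full" and "reduced"), the choice of k-means and a random forest, and the number of clusters are all illustrative assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 300 "particles", 8 numeric features.
X_full, _ = make_blobs(n_samples=300, n_features=8, centers=3, random_state=0)

# Two hypothetical descriptor sets: all features, and only the first two.
descriptor_sets = {"full": X_full, "reduced": X_full[:, :2]}


def cluster_then_classify(X, n_clusters=3, seed=0):
    """Cluster X, then cross-validate a classifier on the cluster labels."""
    X_scaled = StandardScaler().fit_transform(X)
    # Unsupervised step: k-means assigns a cluster label to each particle.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(X_scaled)
    # Supervised step: how well can a classifier recover those labels?
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    accuracy = cross_val_score(clf, X_scaled, labels, cv=5).mean()
    return labels, accuracy


results = {name: cluster_then_classify(X) for name, X in descriptor_sets.items()}
for name, (_, accuracy) in results.items():
    print(f"{name}: mean cross-validated accuracy = {accuracy:.3f}")

# Compare the partitions produced by the two descriptor sets
# (adjusted Rand index: 1.0 means identical groupings).
agreement = adjusted_rand_score(results["full"][0], results["reduced"][0])
print(f"agreement between descriptor sets (ARI) = {agreement:.3f}")
```

Comparing both the classifier accuracy and the agreement between partitions is one possible way to quantify how much the outcome depends on the descriptor; other clustering algorithms, classifiers and metrics may suit the real data better.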
Determine how the choice of descriptor affects the classification of palladium nanoparticles when the labels are generated by clustering.
Python programming and experience in data science and machine learning are essential (such as COMP3720, COMP4660, COMP4670, COMP6670, COMP8420). Familiarity with platforms such as scikit-learn, PyTorch, TensorFlow and Keras is desirable.
This is a 24cp project.
machine learning, materials informatics, classification, clustering