About

This website has been set up by me, Alex Hocking, as part of my PhD. I work in the Computer Science department and the Centre for Astrophysics Research at the University of Hertfordshire, UK, where I apply machine learning to analyse astronomy data. I've been very lucky in finding amazing supervisors for this work (I'll add their names here when I have their permission!).


If you have any ideas to improve the site or you are interested in the technology, please get in touch: a.hocking3 at herts.ac.uk or via LinkedIn.

The Motivation

We've been creating unsupervised machine learning systems to automatically analyse astronomy data, including observations from an orbiting telescope (Hubble Space Telescope - CANDELS survey), a ground-based telescope (DECam - DECaLS survey), and data from radio interferometers (ATLAS). The intention of the site is to make the machine learning analysis of the HST:CANDELS observations available to professional astronomers. I'll be building out the site to include the CANDELS galaxies categorized and ordered, with photometric and redshift data. But I have an additional motivation. When I started on my PhD journey I had a layperson's interest in astronomy, but I've found the raw images of galaxies in the CANDELS survey to be truly astonishing. Anyone interested should be able to see them, just as I have, and so here they are!

The Research

Our research into unsupervised machine learning technology at Herts is motivated by the need to automatically analyse large datasets of observations without using any training classifications or other data such as redshift. The ability to automatically classify objects in large surveys enables astronomers to analyse very large datasets to identify types of galaxy much more quickly than was possible using previous methods.

The Machine Learning System

The machine learning system works by analysing many millions of tiny squares of Hubble data to build a dictionary of representations. Galaxies are then located and the dictionary is used to create a representation of each galaxy. This galaxy representation can be used to compare galaxies and to identify which are visually similar to each other. We can then use a clustering approach to categorise galaxies into groups. The galaxy representations also make a 'find me a galaxy like this one' function fairly straightforward to implement. This functionality will be added to the site in the next couple of weeks. I just need to finish the REST service and then deploy the site to Amazon's cloud-based servers.
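
For anyone curious about what that pipeline looks like in code, here is a rough sketch of the idea. It is not the code that runs this site. It assumes plain k-means as a stand-in for the dictionary learning step, a normalised histogram of dictionary entries as the galaxy representation, and cosine similarity for the 'find me a galaxy like this one' lookup. All the function names are made up for illustration, and random arrays take the place of real Hubble cutouts.

```python
import numpy as np

def extract_patches(image, size=4):
    """Cut a 2-D image into non-overlapping size x size squares, one flattened square per row."""
    h, w = image.shape
    h, w = h - h % size, w - w % size  # drop any ragged edge
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, size * size)

def assign(patches, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for every patch."""
    d = ((patches[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def build_dictionary(patches, k=8, iters=10, seed=0):
    """Plain k-means (an assumption): the k centroids act as the dictionary."""
    rng = np.random.default_rng(seed)
    centroids = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(patches, centroids)
        for j in range(k):
            members = patches[labels == j]
            if len(members):  # leave a centroid untouched if its cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def represent(image, centroids, size=4):
    """Describe a galaxy cutout as the fraction of its patches matching each dictionary entry."""
    labels = assign(extract_patches(image, size), centroids)
    hist = np.bincount(labels, minlength=len(centroids)).astype(float)
    return hist / hist.sum()

def most_similar(query, representations):
    """Indices of the representations, ordered from most to least similar to the query."""
    reps = np.asarray(representations, dtype=float)
    query = np.asarray(query, dtype=float)
    sims = reps @ query / (np.linalg.norm(reps, axis=1) * np.linalg.norm(query) + 1e-12)
    return np.argsort(-sims)

# Synthetic stand-ins for galaxy cutouts (random noise, not real data).
rng = np.random.default_rng(1)
images = [rng.random((16, 16)) for _ in range(5)]
all_patches = np.vstack([extract_patches(im) for im in images])
dictionary = build_dictionary(all_patches, k=8)
reps = [represent(im, dictionary) for im in images]
ranking = most_similar(reps[0], reps)  # galaxies ordered by similarity to the first one
```

Grouping galaxies into categories would then be a matter of running a clustering algorithm over the rows of `reps`, and the similarity lookup is the sort of thing a REST endpoint could wrap.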

The Data

All the data used by the machine learning system to produce the categorisations and the galaxy images for this site are from the Hubble Space Telescope. You can download the files from the Space Telescope Science Institute: STScI: CANDELS

If you are interested in learning how to produce images from the Hubble Space Telescope data, check back in a couple of weeks, when I'll add detailed instructions. To a layperson the astronomy acronyms and language used at the STScI site are daunting, but once you know a few simple terms and have the detailed steps at hand it is a fairly straightforward process.