When thousands of Internet users helped astronomers classify types of galaxies through a project called Galaxy Zoo, some of them may not have realized that they were training a machine to do their job.
British astronomers say they used data from the project to develop a software algorithm for galaxy classification that matched the human-generated results 90 percent of the time. Such robot astronomers may well do the bulk of the work in future all-sky galactic surveys. But the research team's leader says we need not fear the rise of the machines: The point of the exercise is to liberate us humans to do the more interesting tasks.
The University of Cambridge's Manda Banerji explained that celestial surveys to come will have to analyze hundreds of millions of galaxies. Banerji herself is involved in one of those surveys, the Dark Energy Survey, which will look at 300 million galaxies over five years, starting in 2011. Another project known as the VISTA Hemisphere Survey will take pictures of galaxies over the entire southern celestial hemisphere.
"We're getting to that age where we can't viably do these things using the human eye," Banerji told me today.
In the coming age, improved image-classification software could handle the no-brainers first. "The idea is that if we can eliminate all the things that are pretty standard, and we can give humans just the 10 percent that's left, then we're only bothering the humans to look at the interesting objects," Banerji said.
Galaxy Zoo's organizers were co-authors on the latest study, which is already available in preprint and is set to appear in the Monthly Notices of the Royal Astronomical Society. So it doesn't come as any surprise to them that the fruits of their labors are being used to build better software. Thanks to Internet tools, 250,000 users of the Galaxy Zoo website have already contributed some 60 million galaxy classifications and helped produce 16 scientific papers. In some cases, Galaxy Zoo users have even been listed as co-authors of those papers.
Getting a statistical handle on the cosmic distribution of galaxies is one of the big challenges for astrophysics today: How many are elliptical, or spiral, or clumpy and irregular? Does that distribution change with age? What other characteristics can be correlated with galaxy structure? Such questions could lead to hugely important answers: For instance, the Dark Energy Survey is designed to look for clues in galactic data that could help solve the mystery surrounding the universe's accelerating expansion.
The software developed by Banerji and her team attacks the galaxy-classification challenge using a method that's different from the tried-and-true human approach. Instead of merely eyeballing the shape of a specific galaxy, the algorithm looks at qualities such as color, brightness variations and texture. A reddish galaxy is more likely to be an elliptical, for example, while a bluish galaxy is more likely to be a spiral.
The researchers fed the software a database of galaxies with known shapes, and trained the software to match up those shapes with the other qualities. The fully trained software was then used to classify a bigger database of galaxies on its own, and the machine's verdict matched the humans' verdict more than 90 percent of the time. The remaining cases tended to be relative oddballs, such as a bluish galaxy that for some reason is elliptical.
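For readers who want to see the shape of that train-then-compare loop, here is a minimal sketch in Python. It is not the team's software: the two features (a color index and a light-concentration measure), every number in the synthetic data, and the simple nearest-centroid rule are all assumptions chosen for illustration. The function names `make_galaxy`, `fit` and `predict` are hypothetical.

```python
import random
from statistics import mean, pstdev

def make_galaxy(rng, label):
    """Return ((color, concentration), label) for one synthetic galaxy.
    All means and spreads here are invented for illustration."""
    if label == "elliptical":   # assumed redder and more concentrated
        features = (rng.gauss(0.9, 0.1), rng.gauss(3.2, 0.3))
    else:                       # assumed bluer and less concentrated
        features = (rng.gauss(0.5, 0.1), rng.gauss(2.3, 0.3))
    return features, label

def fit(training):
    """Learn a per-feature scaling and one centroid per class."""
    columns = list(zip(*(features for features, _ in training)))
    mu = [mean(col) for col in columns]
    sigma = [pstdev(col) for col in columns]

    def scale(features):
        return tuple((x - m) / s for x, m, s in zip(features, mu, sigma))

    centroids = {}
    for label in sorted({lab for _, lab in training}):
        rows = [scale(f) for f, lab in training if lab == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return scale, centroids

def predict(model, features):
    """Assign the class whose centroid is closest in scaled feature space."""
    scale, centroids = model
    x = scale(features)
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )

rng = random.Random(42)
labels = ["elliptical", "spiral"]

# Small set with "human" labels for training, larger set for comparison.
training = [make_galaxy(rng, rng.choice(labels)) for _ in range(200)]
survey = [make_galaxy(rng, rng.choice(labels)) for _ in range(1000)]

model = fit(training)
matches = sum(predict(model, f) == label for f, label in survey)
agreement = matches / len(survey)
print(f"machine matches the human labels {agreement:.1%} of the time")
```

The point of the sketch is the workflow rather than the classifier: learn from a labeled sample, run on a larger set, then measure how often the machine and the humans agree.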
The next step is to figure out what other qualities can be used to classify the oddballs correctly, and then upgrade the software. Or just outsource the job to a human.
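The hand-off to humans can be sketched the same way. The snippet below assumes the software reports a probability for each galaxy type, which the article does not specify. The `triage` function, the 0.9 threshold and the galaxy IDs are hypothetical.

```python
def triage(predictions, threshold=0.9):
    """Split (galaxy_id, {label: probability}) pairs into labels the
    machine is confident about and a queue of IDs for human review."""
    accepted, review = {}, []
    for galaxy_id, probs in predictions:
        label, p = max(probs.items(), key=lambda kv: kv[1])
        if p >= threshold:
            accepted[galaxy_id] = label
        else:
            review.append(galaxy_id)
    return accepted, review

# Invented example scores; G3 stands in for an oddball such as a blue elliptical.
predictions = [
    ("G1", {"elliptical": 0.97, "spiral": 0.03}),
    ("G2", {"elliptical": 0.08, "spiral": 0.92}),
    ("G3", {"elliptical": 0.55, "spiral": 0.45}),
]

accepted, review = triage(predictions)
print(accepted)  # {'G1': 'elliptical', 'G2': 'spiral'}
print(review)    # ['G3']
```

Raising the threshold sends more galaxies to people and fewer to the machine, so the setting is a trade-off between human workload and the risk of a wrong automatic label.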
In addition to Banerji, the authors of the paper appearing in the Monthly Notices of the Royal Astronomical Society, "Galaxy Zoo: Reproducing Galaxy Morphologies Via Machine Learning," include Ofer Lahav of University College London, Chris J. Lintott of the University of Oxford, Filipe B. Abdalla (UCL), Kevin Schawinski (Yale), Steven P. Bamford (University of Nottingham), Dan Andreescu (LinkLab), Phil Murray (Fingerprint Digital Media), M. Jordan Raddick (Johns Hopkins), Anze Slosar (Brookhaven National Lab), Alex Szalay (Johns Hopkins), Daniel Thomas (University of Portsmouth) and Jan Vandenberg (Johns Hopkins).