SG separation challenge
Several people are now testing their own approaches:
- Cut-based with DESDM info (Eli, Diego, Nacho, Ryan...).
- Multi-class (Maayane)
- Random Forests (Ryan)
- Boosted Decision Trees (Nacho)
- Alternative Neural Network with probabilistic output (Chris Bonnett).
- Probability based on spread model and photometry (DES-Brazil)
I think the time is right, and the codes are mature enough, to launch a dedicated SG separation challenge, mimicking the successful photo-z WG exercise.
We have to establish:
- The training/validation/testing sample (COSMOS, others).
- Only stars and galaxies? What about QSOs, image artifacts?
- The metrics (Fixed cut, Fixed purity, Fixed Efficiency, ROC -- see example below).
- SVA1 systematics: correlations with depth, Galactic latitude, seeing, etc.
- Who/how to run it.
- Is there any gain combining them (a committee)?
- The schedule.
We suggest using the same metrics as in the DES star/galaxy separation (on simulations) paper (arXiv:1306.5236).
Completeness and Purity provided by a given classifier
We define the parameters used to quantify the quality of a star/galaxy classifier. For a given class of objects, X (stars or galaxies), we distinguish the surface density of correctly classified objects, N_X, from that of misclassified objects, M_X.
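- The galaxy completeness c^g is the ratio of the number of true galaxies classified as galaxies to the total number of true galaxies: c^g = N_G / (N_G + M_G), where M_G counts true galaxies misclassified as stars.
- The stellar contamination f_s is the ratio of the number of stars classified as galaxies to the total number of objects classified as galaxies: f_s = M_S / (N_G + M_S), where M_S counts true stars misclassified as galaxies.
- The purity p^g is defined as p^g = 1 - f_s = N_G / (N_G + M_S).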
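As a minimal sketch of how these three quantities could be computed for one classifier, the snippet below takes a truth flag and a predicted flag per object. The function name `galaxy_metrics` and its arguments are hypothetical, not part of any DES code. It works with raw counts rather than surface densities, which gives the same ratios as long as all objects come from the same area. Here N_G is the number of true galaxies classified as galaxies, M_G the number of true galaxies classified as stars, and M_S the number of true stars classified as galaxies.

```python
import numpy as np


def galaxy_metrics(is_galaxy_true, is_galaxy_pred):
    """Return (completeness c^g, contamination f_s, purity p^g) of the galaxy sample.

    Hypothetical helper: both inputs are array-likes with one boolean flag per
    object (True = galaxy, False = star).
    """
    truth = np.asarray(is_galaxy_true, dtype=bool)
    pred = np.asarray(is_galaxy_pred, dtype=bool)

    n_g = np.sum(truth & pred)    # N_G: true galaxies classified as galaxies
    m_g = np.sum(truth & ~pred)   # M_G: true galaxies classified as stars
    m_s = np.sum(~truth & pred)   # M_S: true stars classified as galaxies

    # Guard against empty denominators (no true galaxies, or nothing selected).
    completeness = n_g / (n_g + m_g) if (n_g + m_g) > 0 else float("nan")
    contamination = m_s / (n_g + m_s) if (n_g + m_s) > 0 else float("nan")
    purity = 1.0 - contamination
    return completeness, contamination, purity


# Toy example: 4 true galaxies and 2 true stars.
truth = [1, 1, 1, 1, 0, 0]
pred = [1, 1, 1, 0, 1, 0]
c_g, f_s, p_g = galaxy_metrics(truth, pred)
print(c_g, f_s, p_g)  # c^g = 0.75, f_s = 0.25, p^g = 0.75
```

Extending this to more than two classes (QSOs, image artifacts) would require deciding how those objects enter the denominators, which is one of the open questions listed above.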
Below are three plots we suggest using to assess the performance of each classifier.
Example, on simulations, from arXiv:1306.5236
Purity as a function of magnitude (for fixed completeness; the threshold/cut is left free)
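One way such a curve could be produced is sketched below, assuming each classifier outputs a continuous galaxy score per object (higher meaning more galaxy-like). In each magnitude bin the cut is set from a quantile of the true-galaxy scores so that the galaxy completeness is approximately the requested value, and the purity N_G / (N_G + M_S) is then evaluated at that cut. The function name, its arguments, and the binning are all hypothetical choices for illustration, not the procedure used in arXiv:1306.5236.

```python
import numpy as np


def purity_at_fixed_completeness(score, is_galaxy_true, mag, mag_edges,
                                 target_completeness):
    """Galaxy purity per magnitude bin, with the cut tuned bin by bin.

    Hypothetical helper. `score` is the classifier output (higher = more
    galaxy-like), `mag_edges` are the bin edges, and `target_completeness`
    is the galaxy completeness to hold (approximately) fixed in every bin.
    """
    score = np.asarray(score, dtype=float)
    truth = np.asarray(is_galaxy_true, dtype=bool)
    mag = np.asarray(mag, dtype=float)

    purities = []
    for m_lo, m_hi in zip(mag_edges[:-1], mag_edges[1:]):
        in_bin = (mag >= m_lo) & (mag < m_hi)
        galaxy_scores = score[in_bin & truth]
        if galaxy_scores.size == 0:
            purities.append(np.nan)  # no true galaxies in this bin
            continue
        # Cut such that roughly (1 - target) of the true galaxies fall below it.
        # With small samples the achieved completeness is only approximate.
        cut = np.quantile(galaxy_scores, 1.0 - target_completeness)
        selected = in_bin & (score >= cut)
        n_g = np.sum(selected & truth)    # galaxies kept
        m_s = np.sum(selected & ~truth)   # stars leaking into the galaxy sample
        purities.append(n_g / (n_g + m_s))
    return np.array(purities)


# Toy example with two magnitude bins; the fainter bin has one high-scoring star.
score = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.9, 0.8, 0.7, 0.6, 0.65, 0.1]
truth = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
mag = [20.5] * 6 + [21.5] * 6
print(purity_at_fixed_completeness(score, truth, mag, [20, 21, 22], 1.0))
```

Plotting the returned array against the bin centres for each classifier gives one curve per method on a common completeness footing.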