Testing NGMIX Star/Galaxy Separation
This started as an investigation into problems with redMaGiC galaxy selection at z~0.7, where there is an overabundance of bright red "galaxies" that are clearly misclassified stars. And it ended in a much darker place...
We're running ADDGALS tests, and using a redMaGiC-like red galaxy selection to do sanity checks on whether the red fraction looks the same in the data as in the mocks. At z<0.6 in the latest test runs things look good (with a realistic SVA1-SPTE error model), but at 0.6<z<0.8:
Here the plot is galaxies per square degree vs absolute magnitude; blue points are SVA1-SPTE and red points are mocks over the same footprint. At the bright end, especially at chisq > 2 (so red but not perfectly red), there's a real excess of objects. Visual inspection shows that these are indeed stars. (As a side note, they look even more starlike in Y1A1 data, which has noticeably better image quality. Yay!)
This is a particular problem at these redshifts because this is where the stellar locus comes close to the red sequence (though not a perfect match...)
In any event, it's a significant contamination...and many of these misclassified stars are also appearing as redMaPPer cluster members (d'oh!).
Using NGMIX for Star/Galaxy Separation
The NGMIX catalog has a pretty cool size measurement, EXP_T_S2N. This is an estimate of the signal-to-noise ratio of the intrinsic size of the object, before PSF convolution, using a multi-epoch, multi-band fit. So the maximum amount of information is used (if the seeing is better in one band, this is incorporated into the measurement), and we avoid the problems of stacking (since the PSF is better estimated for each individual CCD image). (Of course, there's still the brighter/fatter effect...)
Looking at SPTE, EXP_T_S2N does a very nice job of (apparently!) separating stars and galaxies, especially at i<22.5 (and it looks like a probabilistic estimate could be made as well).
To my eye, this is cleaner than the separation in (coadded) spread_model (not shown at the moment). Note that the scale is a log scale, and there are many more galaxies than stars at these magnitudes. But there are still enough stars to be a problem if they aren't cleaned out well.
So what does the contamination look like using red sequence selection? Here we have good photo-zs of red galaxies (and wrong-o for the stars, of course). I've computed red sequence photo-zs ("zred2") for all objects in Gold SPTE, and taken those with chi2<20 (which are considered as possible inputs to redMaPPer/redMaGiC), with an additional selection of being brighter than 0.2L* at the photo-z. Note that at chi2>~5 the rate of photo-z failures increases, but this is a rough estimate of what could be going on.
I've looked at a few redshifts:
This is ngmix exp_t_s2n vs mag_auto. Red points are those objects that are classified as galaxies by MODEST_CLASS, but as stars by EXP_T_S2N (<3). Fewer than 0.5% of the objects are misclassified for the two lower redshift bins, but for the 0.6<zred2<0.7 bin, 5% are failures. This is a problem!
But are these stars? MODEST and NGMIX can't both be right here...maybe this is a problem in NGMIX?
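The failure rate quoted above can be sketched as two boolean cuts inside a zred2 bin. This is a minimal illustration, not the code behind the plots: the array arguments are hypothetical, MODEST_CLASS 1 is taken to mean "galaxy" (as in the catalog cuts listed further down), and using the MODEST galaxies in the bin as the denominator is my assumption.

```python
import numpy as np

def misclassified_fraction(modest_class, exp_t_s2n, zred2, zlo, zhi, s2n_star=3.0):
    """Fraction of MODEST galaxies with zlo < zred2 < zhi that EXP_T_S2N calls stars."""
    modest_class = np.asarray(modest_class)
    exp_t_s2n = np.asarray(exp_t_s2n)
    zred2 = np.asarray(zred2)

    in_bin = (zred2 > zlo) & (zred2 < zhi)
    modest_gal = in_bin & (modest_class == 1)   # assumed: 1 = galaxy, 2 = star
    ngmix_star = exp_t_s2n < s2n_star           # intrinsic size not significantly detected

    n_gal = modest_gal.sum()
    if n_gal == 0:
        return float("nan")                     # empty bin: no meaningful fraction
    return (modest_gal & ngmix_star).sum() / n_gal
```

Calling it once per redshift bin (for example with zlo=0.6, zhi=0.7) gives one number per panel.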
Testing Star Selection via Cross-Correlation
I've done some correlation function tests, assuming that (a) fainter stars should be spatially correlated with brighter stars (modulo disk/halo issues), and (b) stars should not be correlated with galaxies. All correlations are with Mike Jarvis' TreeCorr.
To get the mask, I matched all good objects from ngmix009 to Gold, and then pixelized the sky with nside=1024. All pixels with more than 100 objects were considered "good" (this is about 1.5sigma low from the mean number of objects). This footprint step had to be done because Erin and I haven't worked out the "NGMIX mask" yet (on the list for the next couple of weeks!) Random points were generated uniformly in these good pixels.
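The footprint step can be sketched roughly as follows. This is not the code that was actually run: it assumes each matched object already has a pixel index, and it takes the pixelization as a function argument (`ang2pix`, a hypothetical name) so the sketch stays self-contained. In practice that function would presumably wrap something like `healpy.ang2pix(1024, ra, dec, lonlat=True)`. Randoms are drawn by rejection sampling inside an RA/Dec bounding box, which is my choice rather than anything stated above.

```python
import numpy as np

def good_pixels(pix, min_count=100):
    """Sorted IDs of pixels that hold more than min_count objects."""
    ids, counts = np.unique(np.asarray(pix), return_counts=True)
    return ids[counts > min_count]

def randoms_in_footprint(n_rand, good, ang2pix, rng,
                         ra_range=(0.0, 360.0), dec_range=(-90.0, 90.0),
                         batch=100_000):
    """Draw n_rand points uniformly on the sky, keeping only those in good pixels.

    ang2pix(ra_deg, dec_deg) must return pixel IDs in the same scheme as `good`.
    The bounding box must overlap the good pixels, or the loop never finishes.
    """
    if len(good) == 0:
        raise ValueError("no good pixels to sample from")
    sin_lo, sin_hi = np.sin(np.radians(dec_range))
    ra_keep, dec_keep, n_kept = [], [], 0
    while n_kept < n_rand:
        ra = rng.uniform(ra_range[0], ra_range[1], batch)
        # uniform in sin(dec) gives uniform density per unit solid angle
        dec = np.degrees(np.arcsin(rng.uniform(sin_lo, sin_hi, batch)))
        keep = np.isin(ang2pix(ra, dec), good)
        ra_keep.append(ra[keep])
        dec_keep.append(dec[keep])
        n_kept += int(keep.sum())
    return np.concatenate(ra_keep)[:n_rand], np.concatenate(dec_keep)[:n_rand]
```

A tight bounding box around the survey area matters here: sampling the full sphere for a footprint of a few hundred square degrees would reject almost every draw.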
The reference catalog is bright objects that both classifiers call stars: MODEST_CLASS 2, i < 21, EXP_T_S2N < 2. These are definitely stars.
The test catalogs are:
- Faint stars that agree: MODEST_CLASS 2, EXP_T_S2N < 2, 21 < MAG_AUTO_I < 22.5 (400000)
- Faint galaxies that agree: MODEST_CLASS 1, EXP_T_S2N > 3, 21 < MAG_AUTO_I < 22.5 (2.3 million)
- MODEST_STARS that are not stars in NGMIX: MODEST_CLASS 2, EXP_T_S2N > 3, 21 < MAG_AUTO_I < 22.5 (there are very few of these, actually, only 20403)
- NGMIX_STARS that are not stars in MODEST: MODEST_CLASS 1, EXP_T_S2N < 2, 21 < MAG_AUTO_I < 22.5 (there are lots of these! 200000!)
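Written as boolean masks, the reference and test selections look something like the sketch below. The function and dictionary key names are made up for illustration, and I'm assuming the "i < 21" cut on the reference stars refers to MAG_AUTO_I like the others.

```python
import numpy as np

def select_test_catalogs(modest_class, exp_t_s2n, mag_auto_i):
    """Boolean masks for the reference stars and the four test catalogs."""
    modest_class = np.asarray(modest_class)
    exp_t_s2n = np.asarray(exp_t_s2n)
    mag_auto_i = np.asarray(mag_auto_i)

    faint = (mag_auto_i > 21) & (mag_auto_i < 22.5)
    modest_star = modest_class == 2
    modest_gal = modest_class == 1
    ngmix_star = exp_t_s2n < 2
    ngmix_gal = exp_t_s2n > 3   # objects with 2 <= EXP_T_S2N <= 3 land in no catalog

    return {
        "reference_stars": modest_star & ngmix_star & (mag_auto_i < 21),
        "sure_stars": faint & modest_star & ngmix_star,
        "sure_galaxies": faint & modest_gal & ngmix_gal,
        "modest_only_stars": faint & modest_star & ngmix_gal,
        "ngmix_only_stars": faint & modest_gal & ngmix_star,
    }
```

Each mask can then index the RA/Dec arrays to build the catalog that goes into the correlation.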
That is, NGMIX finds 50% more stars than MODEST at these magnitudes (200000 on top of the 400000 they agree on)...this ends up as a contamination of almost 10% at the fainter magnitudes. This isn't peanuts!
Now for the correlations. Small scales are dicey because of noise and because of the mask not being correct. But....
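For the record, here is the arithmetic from the counts listed above. Which denominators to use is my reading, not something spelled out in the notes, and objects with 2 <= EXP_T_S2N <= 3 are left out because they fall in none of the four catalogs.

```python
# Counts quoted above for 21 < MAG_AUTO_I < 22.5
n_sure_stars = 400_000        # MODEST star and NGMIX star
n_sure_gals = 2_300_000       # MODEST galaxy and NGMIX galaxy
n_ngmix_only_stars = 200_000  # MODEST galaxy but NGMIX star

# Extra stars found by NGMIX, relative to the stars both methods agree on
extra_star_fraction = n_ngmix_only_stars / n_sure_stars

# Share of the MODEST "galaxy" sample that NGMIX calls stars (assumed denominator)
contamination = n_ngmix_only_stars / (n_sure_gals + n_ngmix_only_stars)

print(f"{extra_star_fraction:.0%}")  # 50%
print(f"{contamination:.0%}")        # 8%
```

Averaged over the whole magnitude bin this comes out near 8%; the figure at the faint end of the bin can't be checked from these totals alone.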
The galaxies are basically uncorrelated with the bright stars. (Yay!) (Now, is this low enough not to matter at all? Ashley could tell us, I'm sure.)
The sure stars are correlated with the bright stars. And the NGMIX stars are nearly as correlated with the bright stars as the sure stars. These are stars, not galaxies. The MODEST stars (the ones NGMIX calls galaxies) are more starlike than galaxylike, but there's definitely a mix here. Probably not surprising since I've done some very simple hard cuts to make these estimates.
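To make the measurement concrete, here is a brute-force sketch of an angular cross-correlation between the reference stars and one test catalog. The real measurements used TreeCorr; this is only a stand-in to show the idea, and the estimator below (data-data pairs over data-random pairs, minus one) is an assumption on my part, since the notes don't say which estimator was used. It builds the full pair matrix, so it only makes sense for small subsamples.

```python
import numpy as np

def unit_vectors(ra_deg, dec_deg):
    """Convert RA/Dec in degrees to unit vectors on the sphere."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    return np.column_stack([np.cos(dec) * np.cos(ra),
                            np.cos(dec) * np.sin(ra),
                            np.sin(dec)])

def pair_counts(ra1, dec1, ra2, dec2, edges_deg):
    """Histogram of angular separations between every point in set 1 and set 2."""
    cosang = np.clip(unit_vectors(ra1, dec1) @ unit_vectors(ra2, dec2).T, -1.0, 1.0)
    sep = np.degrees(np.arccos(cosang))
    counts, _ = np.histogram(sep.ravel(), bins=edges_deg)
    return counts.astype(float)

def cross_correlation(ref, test, rand, edges_deg):
    """w(theta) = (D_ref D_test / D_ref R) * (N_rand / N_test) - 1.

    ref, test, rand are (ra, dec) tuples in degrees; rand samples the footprint.
    Bins with no reference-random pairs come back as nan or inf.
    """
    dd = pair_counts(*ref, *test, edges_deg)
    dr = pair_counts(*ref, *rand, edges_deg)
    n_test, n_rand = len(test[0]), len(rand[0])
    with np.errstate(divide="ignore", invalid="ignore"):
        return dd / dr * (n_rand / n_test) - 1.0
```

With this estimator, a test catalog distributed like the randoms gives w near zero (the galaxy case), while one that traces the reference stars gives w above zero (the star case).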