Testing Correlation Functions with Pixelized Mangle Masks

-Eli Rykoff & Eduardo Rozo

One issue that has come up is what biases appear if we use a pixelized mangle mask. (The advantage of pixelizing is, of course, that things scale better to the full DES survey. The disadvantage is that we lose the small-scale variations.)

As a test, Eduardo and I have discussed generating a fake data set that has zero galaxy correlation. Then, when we analyze these data using randoms generated from the various pixelized masks...what happens to the correlation function? Is it biased? Note that this test has nothing to say about the detection problems due to stars, blending, etc. This is simply to answer the question of the effect of pixelizing the mangle mask.

Also, this test only addresses additive systematics. We have not addressed multiplicative systematics here (though if we had to guess, any mask systematics are probably additive rather than multiplicative).

Pixelized Masks

First, I have produced healpix pixelized masks at various NSIDE values for the SPT-E region i-band, covering 244.9 deg^2. Note that since these are fake galaxies that I'm placing in the field there's no need to worry about the LMC.

I start with NSIDE=65536 (0.003 arcmin^2 per pixel, or 0.05 arcmin on a side). I treat this as the true representation of the mask because it captures essentially all of the variability in the mangle mask at a very fine scale. Note that this file is 6.9 GB just to list all the pixel numbers and weights for SPT-E, so this does not scale particularly well to larger areas.
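
As a quick check on the numbers quoted above, the pixel area at a given NSIDE follows from the fact that HEALPix divides the sphere into 12*NSIDE^2 equal-area pixels. The sketch below is only an illustration; the function name is made up for this example and is not part of any mask code.

```python
import math

def healpix_pixel_area_arcmin2(nside):
    """Area of one HEALPix pixel in arcmin^2 (all pixels have equal area)."""
    npix = 12 * nside**2
    # Full sky in arcmin^2: 4*pi steradians, with 1 rad = (180/pi)*60 arcmin.
    full_sky_arcmin2 = 4.0 * math.pi * (180.0 / math.pi * 60.0) ** 2
    return full_sky_arcmin2 / npix

for nside in (65536, 4096, 256):
    area = healpix_pixel_area_arcmin2(nside)
    # The "side" is just sqrt(area); real HEALPix pixels are not squares.
    print(f"NSIDE={nside}: {area:.4g} arcmin^2, ~{math.sqrt(area):.3g} arcmin on a side")
```

For NSIDE=65536 this gives roughly 0.003 arcmin^2, in line with the value quoted above; for NSIDE=4096 the pixels are a bit under 1 arcmin on a side.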

Then I progressively "de-res" this fine-scale mask to NSIDE={32768,16384,8192,4096,2048,1024,512,256}. The de-res procedure first involves collecting all the high-resolution sub-pixels that fall within each coarse pixel. (This is easy in the nest scheme.) All sub-pixels that have non-zero weight are averaged together, and then a "fracbad" is computed, which is the fraction of sub-pixels that have zero weight. The fracbad can be used both to reject bad pixels where we expect the coarse pixelization is unsound, and to probabilistically remove random points to approximate the real density of galaxies on the given scale.
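
The de-res step described above can be sketched in a few lines of numpy. This is a minimal illustration, not the code actually used to make the masks. It assumes a sparse mask stored as arrays of nested pixel numbers and weights, and it assumes that any sub-pixel absent from the list counts as zero weight. The function name deres_nest is hypothetical. In the nested scheme the parent of a pixel is obtained by dropping two bits per factor of 2 in NSIDE.

```python
import numpy as np

def deres_nest(pix_hi, weight_hi, nside_hi, nside_lo):
    """Degrade a sparse NESTED mask from nside_hi to nside_lo.

    Returns (pix_lo, weight_lo, fracbad, nsub). Coarse pixels with no
    good sub-pixels at all are simply not listed (sparse output).
    """
    ratio = nside_hi // nside_lo           # power of 2 for HEALPix
    nsub = ratio * ratio                   # sub-pixels per coarse pixel
    shift = 2 * (ratio.bit_length() - 1)   # 2 bits per factor of 2 in NSIDE

    pix_hi = np.asarray(pix_hi, dtype=np.int64)
    weight_hi = np.asarray(weight_hi, dtype=float)

    # Keep only sub-pixels with non-zero weight; the rest count as bad.
    good = weight_hi > 0
    parent = pix_hi[good] >> shift

    pix_lo, inverse, ngood = np.unique(
        parent, return_inverse=True, return_counts=True)
    wsum = np.bincount(inverse, weights=weight_hi[good], minlength=pix_lo.size)

    weight_lo = wsum / ngood               # mean over good sub-pixels only
    fracbad = 1.0 - ngood / nsub           # fraction with zero (or no) weight
    return pix_lo, weight_lo, fracbad, nsub
```

For example, going from NSIDE=4 to NSIDE=2, sub-pixels 0-3 all map to coarse pixel 0 and sub-pixels 4-7 to coarse pixel 1.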

The format of the mask files is:

  • PIXEL: the healpix nested pixel
  • WEIGHT: the average mangle weight (depth) in the pixel
  • FRACBAD: the fraction of sub-pixels within the pixel that have zero weight

In addition, the FITS header has the fields:
  • NSIDE: nside for the healpix mask
  • NSUB: number of subpixels used to compute fracbad
  • NEST: to indicate the nest ordering scheme

This information is also in the filename.

Note that these are not standard healpix files, because having a full map would explode the memory on your computer.

Generating Fake Galaxies

To generate the galaxies, I used the COSMOS ACS imaging as "truth". COSMOS galaxies with i_ACS < 24.8 were used. The same space density of galaxies was used as in COSMOS (which matches quite well to that observed in SVA1 gold; in total there are ~15 galaxies per arcmin^2 at i<24). The galaxies were distributed completely uniformly on the sky.
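
For illustration, uniform positions on the sky can be drawn by sampling uniformly in RA and in sin(Dec). The sketch below assumes a simple RA/Dec bounding box, which is a stand-in for the real footprint (the actual geometry comes from the mangle mask). The function names and the box limits are made up for this example.

```python
import numpy as np

def box_area_deg2(ra_min, ra_max, dec_min, dec_max):
    """Solid angle of an RA/Dec box in deg^2 (all inputs in degrees)."""
    dsin = np.sin(np.radians(dec_max)) - np.sin(np.radians(dec_min))
    return (ra_max - ra_min) * np.degrees(dsin)

def uniform_sky_points(n, ra_min, ra_max, dec_min, dec_max, rng=None):
    """Draw n points uniformly in solid angle inside an RA/Dec box."""
    rng = np.random.default_rng() if rng is None else rng
    ra = rng.uniform(ra_min, ra_max, n)
    # Uniform in sin(dec) gives uniform surface density on the sphere.
    s_min, s_max = np.sin(np.radians([dec_min, dec_max]))
    dec = np.degrees(np.arcsin(rng.uniform(s_min, s_max, n)))
    return ra, dec

# Hypothetical 1 x 1 degree box at ~15 galaxies per arcmin^2.
box = (60.0, 61.0, -50.0, -49.0)
n_gal = int(15.0 * 3600.0 * box_area_deg2(*box))
ra, dec = uniform_sky_points(n_gal, *box, rng=np.random.default_rng(42))
```

Points generated this way would still need to be passed through the mask before use.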

One list of true galaxies was generated for all the tests, to ensure that we're comparing apples-to-apples. The only difference is the generation of the randoms (though see below on masking).

Given a list of true magnitudes, I estimate observed magnitudes and errors by applying an error model based on the magnitude limit from the mangle mask at that (high resolution) point.

Generating Random Points

Generating random points is done in much the same way as the fake galaxies, using COSMOS as the truth and scattering magnitudes. Approximately 10x as many random points were generated as there are fake galaxies. However, unlike the fake galaxies, the true magnitudes for the randoms were perturbed using the pixelized mask, so we expect that the randoms will be imbued with structure on the scale of the pixelization.

There are a few additional changes to the random points. First, any pixel that has a fracbad > 0.2 is rejected as being unreliable. This has the effect of trimming off ~1% of the area for NSIDE=4096 (and turning the larger round star holes into diamond-shaped healpix-sized holes). Second, when 0.0<fracbad<0.2, then we sample the random points and only keep (1-fracbad) of the random points to "fuzz out" the edges. Finally, because we applied an additional mask based on fracbad, we apply this same coarse mask to the fake galaxy catalog. In real data, if we use a pixelized mask we will want to make sure that the galaxy distribution reflects the pixelized mask and not the ideal mask.
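
The two fracbad steps for the random points can be sketched as follows. This is an illustration under stated assumptions, not the code used here: it assumes the fracbad of the pixel containing each point has already been looked up, and the function name and the subsample flag are hypothetical. Setting subsample=False applies only the hard cut, which is one possible reading of how the coarse mask is applied to the fake galaxies.

```python
import numpy as np

def fracbad_keep_mask(fracbad_at_point, max_fracbad=0.2, subsample=True, rng=None):
    """Boolean mask of points to keep, given the fracbad of each point's pixel.

    Points in pixels with fracbad > max_fracbad are always rejected.
    If subsample is True, the remaining points are kept with
    probability (1 - fracbad).
    """
    rng = np.random.default_rng() if rng is None else rng
    fracbad = np.asarray(fracbad_at_point, dtype=float)

    keep = fracbad <= max_fracbad
    if subsample:
        # u is uniform in [0, 1), so P(u >= fracbad) = 1 - fracbad.
        u = rng.random(fracbad.size)
        keep &= u >= fracbad
    return keep
```

Points in pixels with fracbad = 0 are always kept, and the thinning grows smoothly up to the 0.2 threshold.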

The masked galaxy files are here: And the corresponding random point files are here:

Correlation Functions

To compute correlation functions, I have used Mike Jarvis' 2-point correlation function code. This uses the Landy-Szalay statistic, omega = (DD-2DR+RR)/RR. I am not plotting errors since they are underestimated.
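
For reference, here is a minimal sketch of the Landy-Szalay estimator applied to binned pair counts. It is not the interface of the correlation code used above. It assumes DD and RR are raw counts of unique pairs and DR is the raw count of all data-random cross pairs, and it normalizes each by the total number of possible pairs before combining them. The function name is made up for this example.

```python
import numpy as np

def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy-Szalay estimate of w(theta) from raw binned pair counts.

    dd, rr : unique-pair counts per bin (data-data, random-random)
    dr     : cross-pair counts per bin (data-random)
    """
    dd = np.asarray(dd, dtype=float)
    dr = np.asarray(dr, dtype=float)
    rr = np.asarray(rr, dtype=float)

    # Normalize by the total number of pairs of each type.
    dd_n = dd / (n_data * (n_data - 1) / 2.0)
    dr_n = dr / (n_data * n_rand)
    rr_n = rr / (n_rand * (n_rand - 1) / 2.0)

    return (dd_n - 2.0 * dr_n + rr_n) / rr_n
```

With roughly 10x as many randoms as galaxies, this normalization is what makes the three terms comparable; for an unclustered sample the estimate should scatter around zero.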

I have done 3 runs, with 21<i<21.5; 22<i<22.5; and 23<i<23.5. The statistics are obviously better for the fainter galaxies, but we also expect there to be a greater bias because when you're much brighter than the limiting magnitude the exact details of the depth map shouldn't matter (modulo the star holes).

Finer Scales: NSIDE >= 4096

This is very good news: all the different sets of random points yield identical results to scales as small as 0.5'. Only at ~0.3' or so does the nside=4096 run show any additional biases for the 23<i<23.5 bin. (Even so, some of this may just be noise; the "perfect" random points using nside=65536 also show the possibility of some structure at 0.2', though it's unclear if this is significant.)

Therefore, using nside=4096 (with some sub-sampling) looks like it should be sufficient for most needs except at the finest scales. (And I would guess that other effects, such as crowding and deblending, will dominate the pixelization problems at these scales.)

Coarser scales: NSIDE < 4096

Note scale is different from above

When we go to the coarser scales (NSIDE < 4096), though, problems do develop. All the runs are fine at the bright end (thus, the star holes are not significant, at least with the subsampling, which is good). However, at the faint end it is clear that using a coarse pixelization scheme will induce biases in the correlation function. (This also makes it clear that the uptick at scales < 0.2' for nside=4096 appears to be real.)

Correlation Functions with Approximately Log-Normal Input

The LSS folks have also made available some random points over the SPT-E footprint with an approximately log-normal correlation function (see here). After cuts, I have approximately 10 galaxies per sq. arcmin, a little lower than the target but plenty to see what's going on. Note that the input correlation function is probably biased at scales < 5 Mpc/h (because of the way the simulations were run) and certainly biased at scales < 1.5 Mpc/h (the resolution of the simulations). Nevertheless, these input points are adequate for the present purpose, which is primarily to test the bias imparted by using a pixelized mask. (Note in particular that this says nothing about crowding, proximity effects, etc.)

Finer Scales

And the residuals compared to the "truth" where the random points are generated with NSIDE=65536:

These residuals look much the same as in the no-correlation tests above: nside=4096 is fine to scales of ~0.3-0.5'. Great!

Coarse Scales

And the residuals compared to the "truth" where the random points are generated with NSIDE=65536:

And as above, you don't want to go coarser than nside=4096 unless you stick to the brightest galaxies.


I have tested the effect of both additive and multiplicative bias using uncorrelated galaxies and log-normal correlations. The impact of pixelization is negligible except at scales finer than 0.3-0.5', where other effects are certainly also coming into play. This is good news for pixelizing the masks!

Additional problems may develop when we combine masks for different bands, but that is a much thornier issue.