Sub-Pixelizing Mangle Masks


Currently, we have the full mangle mask which gives the 2" aperture depth at every point in the survey, and the healpix nside=4096 approximation of the mangle masks. The full mangle mask is ... unwieldy, which it would be nice to use a pixelized version. Note that this is close to the finest practical pixelization before desktop workstations will struggle with the memory requirements.

However, as currently implemented, the pixelization simply takes the depth at the central point of that pixel. As each pixel at nside=4096 is 0.75 arcmin^2, this has trouble around stars, boundaries, etc.

The Subpixel Method

A few of us (Alex D-W in particular) have been working on seeing what can be done to calculate the depth on a finer scale and then average it. Note that in order for this to work we need 2 output maps: the first has the average depth in the pixel where the image was not completely masked, and the second is the fraction of the pixel area that is not masked. In this way we can create an anti-aliased map that contains most of the information from the fine scales but is more practical to use. (In particular, this format is useful for cluster masked-region corrections.)

The reason that Aurelien has been doing a single-pixel value is that the regular mangle code is too slow for this procedure (which is optimized for other things.) Luckily, Erin Sheldon has adapted some super-fast code from Martin White in the pymangle python package that is over 50x faster than the default mangle code for looking up weights ... this is very useful!

The Reference

Although doing the full survey at very large nside is unwieldy, we can do a small region at larger nside without too much trouble (though we have to avoid trying to use the regular healpix map functions or you'll blow away your RAM.) As a reference, I have started with the i-band mask in the field of RXJ2248, and have run healpix with nside = 65536 (this is 256x more pixels than nside=4096).

The map looks like this at very high resolution, along with a zoom in to a section in the center:

It's also worth looking at how the map changes as we go from high to low resolution:

The files to make these maps are linked at the bottom: sva1_RXJ2248_nside*_nest_table_weights.fits . This has a PIXEL and WEIGHT with the given nside (nest ordering).

The Subpixel Tests

For a first test, I have taken the i-band mask in the field of RXJ2248, and run nside=4096 with 1, 4, 16, 64, and 256 sub-pixels. In practice, to do all of SVA1 (or the DES footprint), 256 subpixels is "doable" but significantly slower. If we can get away with 16 or 64 that would make things run a lot faster. (The run time for the whole lot on this field was 20 minutes, about half of that for the 256 subpixel run).

These runs are done at fine scale and then averaged down to the coarser nside=4096 scale. I look at the output weight and fraction-observed maps. The delta-mag histogram is shown for the nside=4096 subpixelized version relative to the high res (nside=65536) version. I calculate the fraction of "bad" pixels where the weight is misestimated by 0.1 mag or greater. There is also a histogram of the bad fraction relative to the high res run, and I note when the bad fraction is off by >5%. These are somewhat arbitrary cuts, but they get to the rate of bad outliers.

256 Subpixels

64 Subpixels

16 Subpixels

4 Subpixels

1 (sub)pixel

Higher Resolutions?

NSIDE=4096 is close to the highest resolution that is practical to store a full healpix map of the sky such that many of the healpix tools (mollview, for example) will work. However, significant data compression can be achieved by storing a list of pixel numbers and weights. What are the practical limitations for a 5000 deg^2 survey?

Well, storing pixel numbers (64 bit int) and weights (32 bit float) for 5000 deg^2 at nside=65536 would take 200 Gb per band. This is non-trivial (though in 5 years...?) Right now it would take 12 Gb/band for all of SVA1. Doing nside=32768 gets us down by a factor of 4 (50Gb/band). This actually may be practical, and as seen below only fails for about 1.5% of the area.

Meanwhile, just to be clear, the polygon files currently take ~5Gb/band for SPT-E (150 deg^2), extrapolating to 167 Gb/band for the full survey. So nside=32768 would be a significant savings over the full mangle mask, and is also more efficient to use (for some purposes at least).

Higher Resolution residual maps

These are what the residuals look like compared to hi-res if you take the center value of each pixel:


In terms of getting things roughly right, things actually converge quickly. As soon as we average over 16 subpixels we're at the limit of how well we can do with an nside=4096 map. However, it is apparent that nside=4096, even with subpixel averaging, is insufficient to capture the fine scale in the mask (as seen in the gif above). About 9% of the area has a misestimated depth (>0.1 mag), and of course this is spatially correlated.

In terms of getting the bad fraction, things are a bit different. First, we're dealing with a quantized value when we only have 16 or 64 subpixels, but that's something we can live with. But even with 16 subpixels we have 5% of the time the masked area is misestimated by more than 5%. On the other hand, with 64 subpixels only 2% of the pixels are misestimated by > 5% and most of these are only slight outliers.

But using nside=4096 with any subpixelization scheme is going to run into the hard limits that it's just not fine enough for the variations in the depth. How this impacts various science goals is TBD...