Welcome to ZAP’s documentation!

ZAP (the Zurich Atmosphere Purge) is a high precision sky subtraction tool which can be used as complete sky subtraction solution, or as an enhancement to previously sky-subtracted MUSE integral field spectroscopic data. The method uses PCA to isolate the residual sky subtraction features and remove them from the observed datacube. Though the operation of ZAP is not dependent on perfect flatfielding of the data in a MUSE exposure, better results are obtained when these corrections are made ahead of time. Future development will include expansion to more instruments.

Installation

Requirements

ZAP requires the following packages:

  • Numpy 1.6.0 or later
  • Astropy v1.0 or later
  • SciPy v0.13.3 or later

Many linear algebra operations are performed in ZAP, so it can be beneficial to use an alternative BLAS package. In the Anaconda distribution, the default BLAS comes with Numpy linked to OpenBlas, which can amount to a 20% speedup of ZAP.

Steps

ZAP can be installed using pip ::
pip install zap

Examples

In its most hands-off form, ZAP can take an input fits datacube, operate on it, and output a final fits datacube:

import zap
zap.process('INPUT.fits', outcubefits='OUTPUT.fits')

Care should be taken, however, since this case assumes a sparse field, and better results can be obtained by applying masks.

The main function is zap.process:

There are a number of options that can be passed to the code which we describe here:

zap.process(musecubefits, outcubefits='DATACUBE_ZAP.fits', clean=True, zlevel='median', cftype='weight', cfwidthSVD=100, cfwidthSP=50, pevals=[], nevals=[], optimizeType='normal', extSVD=None, skycubefits=None, svdoutputfits='ZAP_SVD.fits', mask=None, interactive=False)

Performs the entire ZAP sky subtraction algorithm.

Work on an input FITS file and optionally writes the product to an output FITS file.

Parameters:

musecubefits : str

Input FITS file, containing a cube with data in the first extension.

outcubefits : str

Output FITS file, based on the input one to propagate all header information and other extensions. Default to DATACUBE_ZAP.fits.

clean : bool

If True (default value), the NaN values are cleaned. Spaxels with more then 25% of NaN values are removed, the others are replaced with an interpolation from the neighbors. The NaN values are reinserted into the final datacube. If set to False, any spaxel with a NaN value will be ignored.

zlevel : str

Method for the zeroth order sky removal: none, sigclip or median (default).

cftype : str

Method for the continuum filter: median or weight (default). For the weight method, a zeroth order sky is required (see zlevel).

cfwidthSVD : int or float

Window size for the continuum filter, for the SVD computation. Default to 100.

cfwidthSP : int or float

Window size for the continuum filter used to remove the continuum features for calculating the eigenvalues per spectrum. Smaller values better trace the sources. An optimal range of is typically 20 - 50 pixels. Default to 50.

optimizeType : str

Optimization method to compute the number of eigenspectra used for each segment: none, normal (default), enhanced. If none, the number of eigenspectra must be specified with nevals or pevals, otherwise normal is used.

pevals : list

Allow to specify the percentage of eigenspectra used for each segment. If this is used, the pevals is ignored. Provide either a single value that will be used for all of the segments, or a list of 11 values that will be used for each of the segments.

nevals : list

Allow to specify the number of eigenspectra used for each segment. If this is used, the pevals is ignored. Provide either a single value that will be used for all of the segments, or a list of 11 values that will be used for each of the segments.

extSVD : str

Path of an input FITS file containing a SVD computed by the SVDoutput() function. Otherwise the SVD is computed.

skycubefits : str

Path for the optional output of the sky that is subtracted from the cube. This is simply the input cube minus the output cube.

svdoutputfits : str

Output FITS file containing the eigenbasis. Default to ZAP_SVD.fits.

mask : str

A 2D fits image to exclude regions that may contaminate the zlevel or eigenspectra. This image should be constructed from the datacube itself to match the dimensionality. Sky regions should be marked as 0, and astronomical sources should be identified with an integer greater than or equal to 1. Default to None.

interactive : bool

If True, a zclass object containing all information on the ZAP process is returned, and can be used to explore the eigenspectra and recompute the output (with the reprocess() method). In this case, the output files are not saved (outcubefits and skycubefits are ignored). Default to False.

The code can handle datacubes trimmed in wavelength space. Since the code uses the correlation of segments of the emission line spectrum, it is best to trim the cube at specific wavelengths. The cube can include any connected subset of these segments. (for example 6400 - 8200 Angstroms)

[0,    5400]
[5400, 5850]
[5850, 6440]
[6440, 6750]
[6750, 7200]
[7200, 7700]
[7700, 8265]
[8265, 8602]
[8602, 8731]
[8731, 9275]
[9275, 10000]

Sparse Field Case

This case specifically refers to the case where the sky can be measured in the sky frame itself, using:

zap.process('INPUT.fits', outcubefits='OUTPUT.fits')

In both cases, the code will create a resulting processed datacube named DATACUBE_ZAP.fits and an SVD file named ZAP_SVD.fits in the current directory. While this can work well in the case of very faint sources, masks can improve the results.

For the sparse field case, a mask file can be included, which is a 2d fits image matching the spatial dimensions of the input datacube. Masks are defined to be >= 1 on astronomical sources and 0 at the position of the sky. Set this parameter with the mask keyword

zap.process('INPUT.fits', outcubefits='OUTPUT.fits', mask='mask.fits')

Filled Field Case

This approach also can address the saturated field case and is robust in the case of strong emission lines, in this case the input is an offset sky observation. To achieve this, we calculate the SVD on an external sky frame using the function zap.SVDoutput

zap.SVDoutput(musecubefits, svdoutputfits='ZAP_SVD.fits', clean=True, zlevel='median', cftype='weight', cfwidth=100, mask=None)

Performs the SVD decomposition of a datacube.

This allows to use the SVD for a different datacube.

Parameters:

musecubefits : str

Input FITS file, containing a cube with data in the first extension.

svdoutputfits : str

Output FITS file. Default to ZAP_SVD.fits

clean : bool

If True (default value), the NaN values are cleaned. Spaxels with more then 25% of NaN values are removed, the others are replaced with an interpolation from the neighbors.

zlevel : str

Method for the zeroth order sky removal: none, sigclip or median (default).

cftype : str

Method for the continuum filter: median or weight (default). For the weight method, a zeroth order sky is required (see zlevel).

cfwidth : int or float

Window size for the continuum filter, default to 300.

mask : str

Path of a FITS file containing a mask (1 for objects, 0 for sky).

An example of running the code in this way is as follows:

zap.SVDoutput('Offset_Field_CUBE.fits', svdfn='ZAP_SVD.fits', mask='mask.fits')
zap.process('Source_cube.fits', outcubefits='OUTPUT.fits', extSVD='ZAP_SVD.fits', cfwidthSP=50)

The integration time of this frame does not need to be the same as the object exposure, but rather just a 2-3 minute exposure. Often residuals can be further reduced by changing cfwidthSP to a smaller value. However, this parameter should not be reduced to smaller than 15 pixels.

Extra Functions

Aside from the main process, two functions are included that can be run outside of the entire zap process to facilitate some investigations.

zap.nancleanfits(musecubefits, outfn='NANCLEAN_CUBE.fits', rejectratio=0.25, boxsz=1)

Detects NaN values in cube and removes them by replacing them with an interpolation of the nearest neighbors in the data cube. The positions in the cube are retained in nancube for later remasking.

Parameters:

musecubefits : str

Input FITS file, containing a cube with data in the first extension.

outfn : str

Output FITS file. Default to NANCLEAN_CUBE.fits

rejectratio : float

Defines a cutoff for the ratio of NAN to total pixels in a spaxel

before the spaxel is avoided completely. Default to 0.25

boxsz : int

Defines the number of pixels around the offending NaN pixel. Default to 1, which looks for the 26 nearest neighbors which is a 3x3x3 cube.

zap.wmedian(spec, wt, cfwidth=100)

Performs a weighted median filtering of a 1d spectrum

Operates using a cumulative sum curve

Parameters:

spec : numpy.ndarray

Input 1d spectrum to be filtered

wt : numpy.ndarray

A spectrum of equal length as the input array to provide the weights.

cfwidth : int or float

Window size for the continuum filter, for the SVD computation. Default to 100.

Command Line Interface

ZAP can also be used from the command line:

python -m zap INPUT_CUBE.fits

More information use of the command line interface can be found with the command

python -m zap -h

Interactive mode

ZAP can also be used interactively from within IPython

import zap
zobj = zap.process('INPUT.fits', interactive=True)

The run method operates on the datacube, and retains all of the data and methods necessary to process a final data cube in a python class named zclass. You can elect to investigate the data product via the zclass, and even reprocess the cube with a different number of eigenspectra per region. A workflow may go as follows:

import zap
from matplotlib import pyplot as plt

# allow ZAP to run the optimize routine
zobj = zap.process('INPUT.fits', optimization='normal', interactive=True)

# plot the variance curves and the selection of the number of eigenspectra used
zobj.plotvarcurve(5)

# plot a spectrum extracted from the original cube
plt.figure()
plt.plot(zobj.cube[:,50:100,50:100].sum(axis=(1,2)), 'b', alpha=0.3)

# plot a spectrum of the cleaned ZAP dataproduct
plt.plot(zobj.cleancube[:,50:100,50:100].sum(axis=(1,2)), 'g')

# choose just the first 3 spectra for all segmments
zobj.reprocess(nevals=3)

# plot a spectrum extracted from the original cube
plt.plot(zobj.cube[:,50:100,50:100].sum(axis=(1,2)), 'b', alpha=0.3)

# plot a spectrum of the cleaned ZAP dataproduct
plt.plot(zobj.cleancube[:,50:100,50:100].sum(axis=(1,2))), 'g')

# choose some number of modes by hand
zobj.reprocess(nevals=[2,5,2,4,6,7,9,8,5,3,5])

# plot a spectrum
plt.plot(zobj.cleancube[:,50:100,50:100].sum(axis=(1,2))), 'k')

# Use the optimization algorithm to identify the best number of modes per segment
zobj.optimize()

# compare to the previous versions
plt.plot(zobj.cleancube[:,50:100,50:100].sum(axis=(1,2))), 'r')

# identify a pixel in the dispersion axis that shows a residual feature in the original
plt.figure()
plt.matshow(zobj.cube[2903,:,:])

# compare this to the zap dataproduct
plt.figure()
plt.matshow(zobj.cleancube[2903,:,:])

# write the processed cube as a single extension fits
zobj.writecube('DATACUBE_ZAP.fits')

# or merge the zap datacube into the original input datacube, replacing the data extension
zobj.writefits(outcubefits='DATACUBE_FINAL_ZAP.fits')
class zap.zclass(musecubefits)

Main class to run each of the steps of ZAP.

Attributes

cleancube (numpy.ndarray) The final datacube after removing all of the residual features.
contarray (numpy.ndarray) A 2D array containing the subtracted continuum per spaxel.
cube (numpy.ndarray) The original cube with the zlevel subtraction performed per spaxel.
especeval (list of (eigenspectra, eval)) A list containing the full set of eigenspectra and eigenvalues generated by the SVD calculation that is used toy reconstruct the entire datacube.
laxis (numpy.ndarray) A 1d array containing the wavelength solution generated from the header parameters.
wcs (astropy.wcs.WCS) WCS object with the wavelength solution.
lranges (list) A list of the wavelength bin limits used in segmenting the sepctrum for SVD.
nancube (numpy.ndarray) A 3d boolean datacube containing True in voxels where a NaN value was replaced with an interpolation.
nevals (numpy.ndarray) A 1d array containing the number of eigenvalues used per segment to reconstruct the residuals.
normstack (numpy.ndarray) A normalized version of the datacube decunstructed into a 2d array.
varlist (numpy.ndarray) An array for each segment with the variance curve, calculated for the optimize method.
pranges (numpy.ndarray) The pixel indices of the bounding regions for each spectral segment.
recon (numpy.ndarray) A 2d array containing the reconstructed emission line residuals.
run_clean (bool) Boolean that indicates that the NaN cleaning method was used.
run_zlevel (bool) Boolean indicating that the zero level correction was used.
stack (numpy.ndarray) The datacube deconstructed into a 2d array for use in the the SVD.
subespeceval (list of (eigenspectra, eval)) The subset of eigenvalues and eigenspectra used to reconstruct the sky residuals.
variancearray (numpy.ndarray) A list of length nsegments containing variances calculated per spaxel used for normalization
y,x (numpy.ndarray) The position in the cube of the spaxels that are in the 2d deconstructed stack
zlsky (numpy.ndarray) A 1d array containing the result of the zero level subtraction

Methods

chooseevals
make_contcube
mergefits
optimize
plotvarcurve
reconstruct
remold
reprocess
writeSVD
writecube
writeskycube
chooseevals(nevals=[], pevals=[])

Choose the number of eigenspectra/evals to use for reconstruction.

User supplies the number of eigen spectra to be used (neval) or the percentage of the eigenspectra that were calculated (peval) from each spectral segment to be used.

The user can either provide a single value to be used for all segments, or provide an array that defines neval or peval per segment.

make_contcube()

Remold the continuum array so it can be investigated.

Takes the continuum stack and returns it into a familiar cube form.

mergefits(outcubefits)

Merge the ZAP cube into the full muse datacube and write.

optimize(*args, **kwargs)

Function to optimize the number of components used to characterize the residuals.

This function calculates the variance per segment with an increasing number of eigenspectra/eigenvalues. It then deterimines the point at which the second derivative of this variance curve reaches zero. When this occurs, the linear reduction in variance is attributable to the removal of astronomical features rather than emission line residuals.

reconstruct(*args, **kwargs)

Reconstruct the residuals from a given set of eigenspectra and eigenvalues

remold()

Subtracts the reconstructed residuals and places the cleaned spectra into the duplicated datacube.

reprocess(pevals=[], nevals=[])

A method that redoes the eigenvalue selection, reconstruction, and remolding of the data.

writeSVD(svdoutputfits='ZAP_SVD.fits')

Write the SVD to an individual fits file.

writecube(outcubefits='DATACUBE_ZAP.fits')

Write the processed datacube to an individual fits file.

writeskycube(skycubefits='SKYCUBE_ZAP.fits')

Write the processed datacube to an individual fits file.