Conceptual Overview of the Workflow

The data reduction software in the gemini package can be used in a variety of ways to reduce your science data. That generality can get in the way of understanding the process at a conceptual level, separately from the mechanics of invoking the tasks. This chapter offers a tour of what to do and why it is done in a particular way, rather than how it is done.

The path through data reduction is somewhat different for each observing configuration, though there is some commonality, particularly in creating MasterCal reference files.

What the GEMINI and GMOS Packages Offer

Parameters and Characterizations

Many of the instrument parameters and data characterizations necessary for data reduction are provided with the gmos package. They are stored in the header metadata, in data files in the gmos$data/ or gmos$cal/ package directories, or within IRAF package directories, and are accessed by the data processing tasks. They include:

  • Size, location, and orientation of the detectors and the various apertures in the focal plane

  • For each detector that has ever been used in the focal plane:

    • Size of the photo-active and overscan regions

    • Bad pixel lists (for some configurations)

    • Values for the gain and readnoise for each RoI

  • Approximate dispersion coefficients for each grating

  • Filter transmission curves

Many of these values can be overridden with task parameters or custom reference files.

A Uniform Model of the Data

GMOS data are complex, and the gemini.gmos package attempts to hide that complexity from the user (with at least partial success). There are 3 or 4 dimensions of complexity, depending upon the observing configuration, that are represented in FITS file names or MEF content:

  • multiple sensors in the focal plane,

  • concomitant data that describe the science arrays at the pixel level,

  • multiple spectral slitlets (MOS slitlets, longslit sources, or IFU apertures), and

  • multiple stages of intermediate processing stored in FITS files.

This complexity is not an elegant match to FITS or IRAF, which partially explains why it is difficult to use traditional software tools to reduce GMOS data, or to interleave these tools within the gmos package reduction workflow. Moreover, the FITS standard (and its support within IRAF) has limited ability to represent the semantic content and organization of the various data objects, apart from a linear sequence of extensions in a single file. See Data Packaging for an overview of MEF file content and organization.

Concomitant Data

GMOS data files consist of several components, including concomitant data, some of which are created during data reduction. They include the following:

  • Science array, one SCI extension per amplifier

  • Variance array, VAR extension (optional)

  • Data quality array, DQ extension (optional)

  • Processing metadata, stored in FITS HDU keywords

  • Processing logs, as separate ASCII files

where:

  • Raw Images are images consisting of arrays from multiple CCDs (and overscan sections on the CCDs), all stored in the same file. Recall that the spectral orders span the sensors.

  • Bad Pixel Masks are arrays of bit-encoded values for each pixel of the science array indicating the data quality conditions that apply, where zero indicates no DQ conditions.

  • Variance planes are computed as the sum in quadrature of the variance from each processing stage, which for the most part is based on Poisson statistics.

The variance of the reduced science array is computed as:

\[\begin{split}\mathrm{Var}(S_c) = (R/G)^{2} &+ \max(S_r-B,0.0)/G + \mathrm{Var}(O) + \mathrm{Var}(B) + \mathrm{Var}(D) \\ &+ \frac{\mathrm{Var}(F)\times(S_r-B)^{2}}{F^{4}}\end{split}\]

where Var(X) is the variance of the array X, and:

  • B is the Bias Residual MasterCal

  • D is the Dark MasterCal

  • F is the (normalized) Flat-field MasterCal

  • G is the scalar gain in e/ADU

  • O is the overscan region of the science array

  • R is the scalar read noise in electrons

  • \(S_c\) is the calibrated science array

  • \(S_r\) is the raw science array

Note

The expression above for the variance does not include cross-terms to account for correlated noise between neighboring pixels, which would be appropriate once the data have been resampled.
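
As a worked illustration of the variance expression, the following pure-Python sketch evaluates it for a single pixel. The function name, argument names, and sample numbers are hypothetical and are not part of the gmos package; each array quantity is reduced to a scalar for one pixel, with signal values in ADU.

```python
def science_variance(s_raw, bias, flat, gain, read_noise,
                     var_overscan=0.0, var_bias=0.0, var_dark=0.0, var_flat=0.0):
    """Evaluate the variance expression for one pixel (hypothetical helper)."""
    signal = s_raw - bias                            # S_r - B
    read_term = (read_noise / gain) ** 2             # (R/G)^2
    poisson_term = max(signal, 0.0) / gain           # max(S_r - B, 0)/G
    flat_term = var_flat * signal ** 2 / flat ** 4   # Var(F) (S_r - B)^2 / F^4
    return (read_term + poisson_term + var_overscan
            + var_bias + var_dark + flat_term)

# Illustrative numbers only: 1000 ADU of signal, gain 2 e/ADU, read noise 4 e.
var = science_variance(s_raw=1100.0, bias=100.0, flat=1.0, gain=2.0, read_noise=4.0)
print(var)
```

With these inputs the read-noise term contributes 4 and the Poisson term 500, so the Poisson term dominates for well-exposed pixels.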

File Nomenclature

It is usually simplest during data reduction to retain the filenames of raw exposures as provided by the Gemini Observatory Archive, and to allow gmos processing tasks to take care of naming output files. The raw filename template is the following:

<site><yyyy><mm><dd>S<seq>.fits

where S and .fits are literals, and:

  • <site> is one of [N | S] for GMOS-N or GMOS-S

  • <yyyy><mm><dd> is the year, numerical month, and UT day of observation

  • <seq> is a running sequence number within a UT day
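
The template can be checked mechanically. The sketch below parses a raw filename with a regular expression; the function name and the sample filename are hypothetical, and the number of digits in the sequence number is left open because the text above does not specify it.

```python
import re
from datetime import date

# <site><yyyy><mm><dd>S<seq>.fits, with S and .fits as literals.
RAW_NAME = re.compile(
    r"^(?P<site>[NS])(?P<yyyy>\d{4})(?P<mm>\d{2})(?P<dd>\d{2})S(?P<seq>\d+)\.fits$"
)

def parse_raw_name(filename):
    """Split a raw exposure filename into instrument, UT date, and sequence number."""
    m = RAW_NAME.match(filename)
    if m is None:
        raise ValueError(f"not a raw GMOS filename: {filename!r}")
    return {
        "instrument": "GMOS-N" if m["site"] == "N" else "GMOS-S",
        "ut_date": date(int(m["yyyy"]), int(m["mm"]), int(m["dd"])),
        "seq": int(m["seq"]),
    }

info = parse_raw_name("N20120101S0042.fits")  # hypothetical filename
print(info)
```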

The GMOS convention for naming output files is to prepend one or more characters to the input filename. This occurs for each intermediate stage of data reduction processing, and is summarized in the table below. Unfortunately the characters used are not entirely unique, so the meaning of a few of them must be derived from context.

Processing Prefixes

Prefix   Applies to:   Description
-------  ------------  ------------------------------------------------------------------
a        Spec          aperture summed spectra
b        Img           background subtracted image
c        Spec          flux-calibrated spectra
d        IFU           IFU spectral cube
e        Spec          extracted spectra
f        Img           fringe-corrected image
g        Img+Spec      File “gsprepare-d” for reduction
gs       LS+MOS        spectra reduced with gsreduce
n        Spec          sky-subtracted spectra obtained in N&S mode
p        Img+Spec      [Informal] indicates bad pixels replaced with interpolated values
q        Img+Spec      QE corrected image
s        LS+MOS        sky-subtracted spectra
r        Img           images reduced with gireduce
t        Spec          wavelength-calibrated; rectilinear spectral image
x        Spec          spectral image after cosmic-ray rejection

Note that Spec is used above to indicate applicability to all spectral modes: LS, MOS, and IFU.
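
Because prefixes are prepended at each stage, the prefix string reads right to left in processing order. The sketch below separates the prefix from the raw filename and lists the tokens in the order they were applied. The helper names and the sample filename are hypothetical, and treating a trailing "gs" as a single gsreduce token is an assumption made for illustration; as noted above, some prefixes can only be interpreted from context.

```python
import re

# Raw filename at the end of a processed filename.
_RAW = re.compile(r"[NS]\d{8}S\d+\.fits$")

def split_prefix(filename):
    """Return (prefix, raw_name) for a processed filename."""
    m = _RAW.search(filename)
    if m is None:
        raise ValueError(f"no raw GMOS filename found in {filename!r}")
    return filename[:m.start()], m.group(0)

def stages_in_order(prefix):
    """Return prefix tokens in the order they were applied (innermost first)."""
    tokens = []
    if prefix.endswith("gs"):   # assumption: trailing "gs" marks gsreduce output
        tokens.append("gs")
        prefix = prefix[:-2]
    tokens.extend(reversed(prefix))
    return tokens

prefix, raw_name = split_prefix("estgsN20120101S0042.fits")  # hypothetical file
print(prefix, raw_name, stages_in_order(prefix))
```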

Process Integrity

Some gmos tasks are meta-tasks, in that they call other gemini or IRAF tasks to perform most stages of data processing. In some cases, particularly for IFU reductions, a meta-task performs part of the processing, then one or more other tasks perform specialized steps, then the meta-task is resumed for the remainder of the processing. In this sense, these meta-tasks are re-entrant. This flexibility means that gmos tasks must do a great deal of integrity checking on the input data, including:

  • Input files are all accessible, and no output file will be overwritten,

  • The science data and MasterCals match in RoI, binning, gain, readout speed, and (if relevant) filter,

  • Header metadata indicate that preceding processing steps have been performed,

  • The number of file extensions is correct, and

  • The input image dimensions are correct/consistent with MasterCal files

If any of the preconditions are not met, processing will halt (probably in a very inconvenient place). The various processing steps are recorded in the image header; see GMOS Processing Keywords.

All processing steps (and many of the input parameters) are written to the processing log. Error messages, if any, are also written to the log. Check the log if the processing goes awry.
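
The kind of precondition checking described above can be sketched as follows. This is not the gmos implementation: plain dictionaries stand in for header metadata, and every key name, value, and path is hypothetical.

```python
import os

# Configuration items that science data and MasterCals must share.
MATCH_KEYS = ("roi", "binning", "gain", "read_speed")

def check_preconditions(science, mastercal, output_path, use_filter=False):
    """Return a list of problems; an empty list means processing may proceed."""
    problems = []
    keys = MATCH_KEYS + (("filter",) if use_filter else ())
    for key in keys:
        if science.get(key) != mastercal.get(key):
            problems.append(
                f"{key} mismatch: {science.get(key)!r} vs {mastercal.get(key)!r}")
    if science.get("n_ext") != mastercal.get("n_ext"):
        problems.append("number of extensions differs")
    if os.path.exists(output_path):
        problems.append(f"output file {output_path} already exists")
    return problems

science = {"roi": "Full", "binning": "2x2", "gain": "low",
           "read_speed": "slow", "n_ext": 3}
good_cal = dict(science)
bad_cal = dict(science, gain="high")
print(check_preconditions(science, good_cal, "no_such_dir/out.fits"))
print(check_preconditions(science, bad_cal, "no_such_dir/out.fits"))
```

Collecting all problems before halting, rather than stopping at the first one, is a design choice of this sketch.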

GMOS Data Reduction Overview

Data processing for all GMOS configurations begins with preparing all relevant Master Calibration reference (MasterCal) files. See Creating Master Reference Files for details. The Bias Residual and Dark MasterCals are applied in raw pixel space, and so are prepared in exactly the same way for all configurations. All subsequent processing of science exposures and the creation of other MasterCals depends upon the application of these fundamental MasterCals.

Note

Separate MasterCals must be prepared for each setting of the gain, read-out speed, and RoI. Dark MasterCals are generally not needed except for Nod-and-Shuffle (N&S) operating mode.
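
One way to organize calibration exposures before building MasterCals is to group them by the settings named in the note. The sketch below assumes each exposure is described by a dictionary; the key names and filenames are hypothetical.

```python
from collections import defaultdict

def group_for_mastercals(exposures):
    """Group exposure names by (gain, read-out speed, RoI)."""
    groups = defaultdict(list)
    for exp in exposures:
        key = (exp["gain"], exp["read_speed"], exp["roi"])
        groups[key].append(exp["name"])
    return dict(groups)

biases = [  # hypothetical bias exposures
    {"name": "N20120101S0001.fits", "gain": "low", "read_speed": "slow", "roi": "Full"},
    {"name": "N20120101S0002.fits", "gain": "low", "read_speed": "slow", "roi": "Full"},
    {"name": "N20120101S0003.fits", "gain": "low", "read_speed": "fast", "roi": "Full"},
]
groups = group_for_mastercals(biases)
print(groups)
```

Each resulting group would feed one MasterCal.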

Steps Common to All Workflows

  1. Data Preparation. Raw data from the observing environment lack important metadata in the headers that are essential for data reduction and for documenting the provenance of the reduced data products. This initial step inserts these metadata, and for spectroscopic modes appends the appropriate Mask Definition File (MDF) as a table extension.

  2. Overscan Correction. The first pixel-level step in reducing all GMOS data is the overscan correction: characterize the DC bias level during read-out from the overscan pixels, subtract it from the pixel array, then remove the overscan region from the image.

  3. Bias Residual Removal. For most data the next step is to subtract the bias residual, which is the low-amplitude, higher-order structure that remains after the overscan correction. This step may not be helpful for Arc exposures if they were obtained in fast-readout mode.

The order and method for most other steps depend upon the observing configuration, as described in the following subsections. Note that it is scientifically desirable, but not necessary, to delay mosaicing the detector arrays until after flat-fielding, so that the flat-field is not applied to resampled data.
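
The overscan and bias residual steps can be sketched on a toy array. This is an illustration only: it assumes the overscan occupies the last columns of each row and uses a single median level, whereas the actual tasks offer other ways to characterize the overscan. The function names and pixel values are hypothetical.

```python
from statistics import median

def overscan_correct(image, n_overscan):
    """Subtract one overscan level from the array, then trim the overscan columns.

    n_overscan must be at least 1.
    """
    overscan = [v for row in image for v in row[-n_overscan:]]
    level = median(overscan)
    return [[v - level for v in row[:-n_overscan]] for row in image]

def subtract_bias_residual(image, bias_residual):
    """Subtract a Bias Residual MasterCal with the same shape as the image."""
    return [[v - b for v, b in zip(row, b_row)]
            for row, b_row in zip(image, bias_residual)]

raw = [[110.0, 120.0, 100.0, 101.0],   # last two columns are overscan
       [130.0, 115.0,  99.0, 100.0]]
corrected = overscan_correct(raw, n_overscan=2)
residual = [[1.0, 0.0], [0.0, 1.0]]
debiased = subtract_bias_residual(corrected, residual)
print(corrected, debiased)
```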

Imaging Workflow

Reduction of images is the simplest of all the operating modes. A diagram of the nominal workflow is shown below:

../_images/Workflow_img.png

Fig. 1. Nominal order of processing for GMOS imaging data. Successive columns show the conceptual operation, the task for accomplishing the step, and the type of science or calibration data to which the processing step applies. Color background in each column shows the steps that apply (shaded) or the output MasterCal product (dark shaded) named in the column header; steps that are optional or that may not always apply are light shaded. Intermediate products from steps above the double line contain up to 3 image extensions (SCI, and optionally VAR and DQ) for each CCD.

Reduction Synopsis

Continuing with the basic image reduction steps, we have:

  1. QE Correction. Since the Flat-field MasterCals are normalized separately per CCD, the ratio of the quantum efficiencies for CCDs 1 and 3 relative to 2 must be applied separately.

  2. Flat-field Correction. The Flat-field MasterCals for each filter are usually created from twilight flats, and are normalized to a mean of 1.0 over all non-flagged pixels before being divided out of the science frames.

  3. Gain Correction. Each amplifier array is multiplied by its gain (in e/ADU), leaving image extensions with brightness units of electrons.

  4. Fringe Correction. For red passbands where fringing is an issue (i’ and z’ bands), a Fringe MasterCal may be constructed from a large number of science frames with different telescope pointings, so that there is some background everywhere in the combined frame once sources are excluded. The Fringe MasterCal is scaled to the amplitude of the fringe pattern in each science frame with the same filter, then subtracted from it.

  5. Mosaic Amps/Extensions. All extension images in each science file are resampled to the pixel coordinate grid of the central extension image, which accounts for chip gaps and slight rotations between the CCDs.
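
The flat-field and fringe steps above reduce to simple array arithmetic, sketched here in pure Python. The function names and values are hypothetical. The fringe scale factor is taken as an input because the text does not say how it is determined.

```python
def normalize_flat(flat, dq):
    """Scale a flat so its mean over non-flagged (dq == 0) pixels is 1.0."""
    good = [f for f_row, d_row in zip(flat, dq)
            for f, d in zip(f_row, d_row) if d == 0]
    mean = sum(good) / len(good)
    return [[f / mean for f in row] for row in flat]

def flat_field(science, norm_flat):
    """Divide a science array by a normalized flat."""
    return [[s / f for s, f in zip(s_row, f_row)]
            for s_row, f_row in zip(science, norm_flat)]

def subtract_fringe(science, fringe, scale):
    """Subtract a Fringe MasterCal scaled to the science frame."""
    return [[s - scale * f for s, f in zip(s_row, f_row)]
            for s_row, f_row in zip(science, fringe)]

flat = [[2.0, 4.0], [6.0, 100.0]]
dq = [[0, 0], [0, 1]]                  # the 100.0 pixel is flagged
norm = normalize_flat(flat, dq)
flattened = flat_field([[5.0, 10.0], [15.0, 50.0]], norm)
print(norm, flattened)
```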

Beyond the Basics

  • It may be a good idea to refine the WCS if your goal is to derive highly accurate coordinates or offsets.

  • Use gemtools.imcoadd to combine separate, overlapping exposures in the same filter to enable deep source detection or to perform photometry on very extended sources. But be aware that the output may not be scientifically optimal: the task keeps only the intersection, not the union, of the contributing images, on the footprint of the first image in the list of files to co-add. Also, this task neither matches PSFs nor accounts for varying sky brightness and differential atmospheric refraction.

  • Measuring source brightnesses and establishing a photometric zero-point may be accomplished with widely available photometry packages, such as SExtractor.

Proceed to Reduction of Images with IRAF or Reduction of Images with PyRAF.

Long-slit Workflow

Slit spectroscopy is the most popular GMOS operating mode; the workflow is illustrated below. Note that the order of the operations depends somewhat upon how the Flat-field MasterCal is constructed: if it has been mosaiced, then images to be flat-fielded will also be mosaiced before the flat is applied.

../_images/Workflow_ls.png

Fig. 2. Nominal order of processing for GMOS longslit and MOS spectroscopic data. Color coding as Fig. 1.

Reduction Synopsis

Following the Overscan and Bias corrections described above:

  1. Dark Correction. For N&S observing mode, a scaled Dark MasterCal should be subtracted. Best results are obtained if the dark exposure times are very close to those of the targets. For observing modes other than N&S, this step is normally neither needed nor recommended.

  2. CR-Rejection. It is possible at this stage to identify and flag cosmic rays on single frames. But if you have multiple exposures with the same pointing and configuration, CR rejection can be deferred to just after the mosaicing step.

  3. Gain Normalization. The pixel values of each sensor are multiplied by the gain (in \(e^-\)/ADU), leaving brightness units of \(e^-\)/pixel.

  4. Flat-field Correction. The image is divided by a normalized Flat-field MasterCal, where the response function to the flat-field source has been removed. This is done for each CCD if the MasterCal has not been mosaiced.

  5. Mosaic the CCDs. The image extensions for each CCD are mosaiced to form a single SCI image extension. The relative positions and orientations of the CCDs are taken into account, and the image extensions are all resampled to the pixel gridding of the central CCD.

  6. Combine identical exposures. If more than one exposure was obtained with the same configuration and telescope pointing, images may be combined with outlier (e.g., cosmic-ray) rejection, scaling, and background offsetting.

  7. Apply approximate dispersion solution. Keywords are recorded in the image header that describe the approximate zero-point and first-order terms of the dispersion solution. These terms will be updated when the wavelength calibration is applied.

  8. Wavelength Calibration/Transformation. The dispersion solution derived from the associated arc lamp exposure(s) (and for each slitlet for MOS mode) is written into the extension headers.

  9. Sky Subtraction. For longslit spectra, regions along the slit that are free of emission from the target(s) may be specified, and the sky determined from them subtracted from the image at this stage. This may be particularly useful for emission line sources with little continuum, where normal aperture definition may fail. For normal continuum sources, sky subtraction may be performed during spectral extraction.

  10. Extract Spectra. Apertures are defined (usually interactively) for source(s) and sky region(s), and 1-D spectra are constructed for sources by summing along the cross-dispersion direction. If sky subtraction was not performed in the preceding step, a spatial fit to the sky at each wavelength is subtracted for each target.

  11. Apply Flux Calibration. If the spectra require flux calibration, correct for the mean atmospheric absorption at the airmass of the target and apply the sensitivity calibration derived from one or more standard star spectra.
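
The extraction step, including sky subtraction, can be sketched on a small rectified spectral image whose rows run along the slit and whose columns run along wavelength. This is a simplification: the "spatial fit" to the sky is reduced to a constant (the mean of the sky rows) at each wavelength, and the function name, row selections, and pixel values are hypothetical.

```python
def extract_spectrum(image, source_rows, sky_rows):
    """Sum sky-subtracted source rows along the cross-dispersion direction."""
    n_wave = len(image[0])
    spectrum = []
    for j in range(n_wave):
        # Zeroth-order sky estimate at this wavelength.
        sky = sum(image[i][j] for i in sky_rows) / len(sky_rows)
        spectrum.append(sum(image[i][j] - sky for i in source_rows))
    return spectrum

image = [[10.0, 12.0, 11.0],   # sky
         [30.0, 32.0, 41.0],   # source
         [20.0, 22.0, 21.0],   # source
         [10.0, 12.0, 11.0]]   # sky
spectrum = extract_spectrum(image, source_rows=[1, 2], sky_rows=[0, 3])
print(spectrum)
```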

Proceed to Reduction of Long-Slit Spectra with IRAF.

MOS Workflow

Although the MOS reductions are a little more complicated than for longslit, the data reduction workflow is very similar, as illustrated in the figure below.

../_images/Workflow_mos.png

Fig. 3. Nominal order of processing for GMOS MOS spectroscopic data. Color coding as Fig. 1.

Reduction Synopsis

Following the Overscan and Bias corrections described above:

  1. Dark Correction. For N&S observing mode, a scaled Dark MasterCal should be subtracted. Best results are obtained if the dark exposure times are very close to those of the targets. For observing modes other than N&S, this step is normally neither needed nor recommended.

  2. CR-Rejection. It is possible at this stage to identify and flag cosmic rays on single frames. But if you have multiple exposures with the same pointing and configuration, CR rejection can be deferred to just after the mosaicing step.

  3. Flat-field Correction. The image is divided by a normalized Flat-field MasterCal, where the response function to the flat-field source has been removed. This is done for each CCD if the MasterCal has not been mosaiced.

  4. Gain Normalization. The pixel values of each sensor are multiplied by the gain (in \(e^-\)/ADU), leaving brightness units of \(e^-\)/pixel.

  5. Mosaic the CCDs. The image extensions for each CCD are mosaiced to form a single SCI image extension. The relative positions and orientations of the CCDs are taken into account, and the image extensions are all resampled to the pixel gridding of the central CCD.

  6. Combine identical exposures. If more than one exposure was obtained with the same configuration and telescope pointing, images may be combined with outlier (e.g., cosmic-ray) rejection, scaling, and background offsetting.

  7. Partition Slitlets. The MDF file defines the 2-D regions that will be extracted from the parent image. These regions are each stored as an image (SCI) extension in the output file, along with associated VAR and DQ arrays if applicable.

  8. Apply approximate dispersion solution. Keywords are recorded in the image header that describe the approximate zero-point and first-order terms of the dispersion solution. These terms will be updated when the wavelength calibration is applied.

  9. Wavelength Calibration/Transformation. The dispersion solution derived from the associated arc lamp exposure(s) (and for each slitlet for MOS mode) is written into the extension headers.

  10. Extract Spectra. Apertures are defined (usually interactively) for source(s) and sky region(s), and 1-D spectra are constructed for sources by summing along the cross-dispersion direction, and for each target subtracting a spatial fit to the sky at each wavelength.

  11. Apply Flux Calibration. If the spectra require flux calibration, correct for the mean atmospheric absorption at the airmass of the target and apply the sensitivity calibration derived from one or more standard star spectra.
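
The slitlet partitioning step amounts to cutting rectangular regions out of the mosaiced image. In the sketch below each slitlet is described by pixel bounds; these field names are hypothetical and do not correspond to actual MDF column names.

```python
def partition_slitlets(image, slit_regions):
    """Return one 2-D cutout per slitlet (upper bounds are exclusive)."""
    return [
        [row[s["x_min"]:s["x_max"]] for row in image[s["y_min"]:s["y_max"]]]
        for s in slit_regions
    ]

# 4x4 toy image whose pixel value encodes (row, column) as 10*row + column.
image = [[10 * r + c for c in range(4)] for r in range(4)]
slit_regions = [
    {"x_min": 0, "x_max": 4, "y_min": 0, "y_max": 2},
    {"x_min": 1, "x_max": 3, "y_min": 2, "y_max": 4},
]
cutouts = partition_slitlets(image, slit_regions)
print(cutouts)
```

In the real workflow each cutout would become its own SCI extension, with matching VAR and DQ cutouts where present.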

Proceed to Reduction of Multi-Object Spectra with IRAF.

IFU Workflow

Integral field unit observations are the most complex to process of all, largely because data reduction tools are less customized and flexible. Much of the processing consists of running the gfreduce task with some processing switches turned off to partially process the data, running customized tasks for operations that it doesn’t support, then re-running gfreduce with different switches enabled, and so on. The workflow is summarized in the figure below, and in the following narrative.

../_images/Workflow_ifu.png

Fig. 4. Nominal order of processing for GMOS IFU spectral imaging data. Color coding as Fig. 1.

Compared to reductions for other GMOS configurations, IFU data involve additional considerations:

  • There are a huge number of fibers (1000 plus 500 in the object and sky apertures, respectively), meaning it is impractical to perform fits interactively for determining calibrations.

  • The fibers are packed very close together at the slit, so that the dispersed spectra resemble tightly spaced spectral orders.

  • Instrument flexure can change the position of the fiber spectra by a significant fraction of their spatial extent on the CCDs. Contemporaneous flat-field exposures are needed to provide a reference for tracing the position of the fiber spectra; corrections for radial velocity may also be advisable.

Reduction Synopsis

Following the overscan and bias corrections, IFU processing generally consists of the following:

  1. Gain Normalization. The pixel values of each sensor are multiplied by the gain (in \(e^-\)/ADU), leaving brightness units of \(e^-\)/pixel.

  2. Insert Static BPM. The Static Bad Pixel Mask (BPM) is inserted into the Science MEF (or replaces the one that was generated in prior processing) as DQ extensions. This step is performed automatically for the other GMOS configurations (imaging, LS, MOS).

  3. CR-Rejection. Cosmic rays and bad pixels can have an outsized impact on the extraction and calibration of the closely spaced fiber spectra. The bad columns and any marked but unremoved CRs need to be interpolated over, so as not to confuse down-stream processing. Generally cosmic-ray rejection (and SCI image interpolation) is performed on single images, as instrument flexure makes it inadvisable to combine all but short exposures that were obtained in sequence.

  4. Scattered light correction. Inter-fiber scattered light is significant for exposures of GCAL flats and well exposed targets, such as standard stars. Removing it is necessary for accurate flux calibration.

  5. QE Correction. The relative quantum efficiency between the CCDs is adjusted to that of the central CCD. This correction is applied to the GMOS-N EEV and the GMOS-N and GMOS-S Hamamatsu sensors.

  6. Spectral extraction. The spectra are extracted over all CCDs and the extent of each fiber, using a contemporaneous GCAL flat exposure as the trace template. The relative positions and orientations of the CCDs are taken into account, and the image extensions are all resampled to the pixel gridding of the central CCD. The spectra from the object and sky apertures are written to separate image extensions.

  7. Flat-field Correction. The extracted fiber spectra are divided by a normalized Flat-field MasterCal, where scattered light and the response function to the flat-field source have been removed.

  8. Wavelength Calibration. The dispersion solution derived from the associated arc lamp exposure(s) (for each fiber) is used to define a geometric transformation which, when applied, transforms the extracted fiber spectra to the same linear wavelength scale.

  9. Sky Subtraction. For two-slit mode, the sky emission may be determined from the sky aperture. For one-slit mode the sky must be determined by summing the fibers that do not include the target.

  10. Flux Calibration. A correction for atmospheric extinction is applied, followed by the application of the sensitivity function derived from one or more standard stars.

  11. Aperture Summation. Depending upon the science goals, it may be useful to combine the fiber spectra over a specified spatial extent of the focal plane.
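
Sky subtraction in one-slit mode and aperture summation can be sketched for a handful of extracted, wavelength-calibrated fiber spectra. The function names, fiber indices, and values are hypothetical. The sketch uses the mean of the target-free fibers as the per-fiber sky spectrum, which is one simple way to scale their sum.

```python
def sky_subtract(fibers, target_ids):
    """Subtract the mean spectrum of target-free fibers from every fiber."""
    sky_ids = [i for i in range(len(fibers)) if i not in target_ids]
    n_wave = len(fibers[0])
    sky = [sum(fibers[i][j] for i in sky_ids) / len(sky_ids)
           for j in range(n_wave)]
    return [[v - s for v, s in zip(spec, sky)] for spec in fibers]

def sum_aperture(fibers, ids):
    """Combine the spectra of the selected fibers into one spectrum."""
    return [sum(fibers[i][j] for i in ids) for j in range(len(fibers[0]))]

fibers = [[5.0, 6.0], [15.0, 26.0], [5.0, 6.0], [25.0, 16.0]]
target_ids = {1, 3}                    # fibers that contain the target
subtracted = sky_subtract(fibers, target_ids)
total = sum_aperture(subtracted, sorted(target_ids))
print(subtracted, total)
```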

Beyond the Basics

Some additional processing may be warranted, depending upon the observing program and the science goals.

  • Adjust Wavelength Zero-point. Narrow emission lines from the night sky may be used to determine a correction to the zero-point of the wavelength calibration.

  • Adjust Spatial Alignment. Correlate image features between exposures to detect any spatial mis-alignments, and adjust the CRVALi accordingly.

  • Merge IFU Data Cubes. If more than one exposure of a target was obtained with the same configuration and telescope pointing, they may need to be resampled to a common wavelength or spatial grid before being combined.
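
The wavelength zero-point adjustment can be sketched as a comparison of measured night-sky line centroids with their reference wavelengths. The function name, the measured values, and the starting CRVAL are hypothetical. The reference values are the [O I] night-sky lines near 5577.338 and 6300.304 Å, used here only as an illustration, and taking the median shift is an assumption of this sketch.

```python
from statistics import median

def zero_point_shift(measured, reference):
    """Median offset (reference minus measured) over the matched sky lines."""
    return median(r - m for m, r in zip(measured, reference))

reference = [5577.338, 6300.304]   # [O I] night-sky lines, in Angstroms
measured = [5577.538, 6300.504]    # hypothetical centroids from the science frame
shift = zero_point_shift(measured, reference)

crval = 5000.0                     # hypothetical wavelength zero-point keyword value
corrected_crval = crval + shift
print(shift, corrected_crval)
```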

Proceed to Reduction of IFU Spectral Images with IRAF.