Hubble Space Telescope

The HST Science Archive provides access to observations obtained with the Hubble Space Telescope (HST), operated by STScI (NASA) and ESAC (ESA).

Observations can be queried by criteria such as object name, position, and instrument, and retrieved using the following links:

Search for Hubble Data

Search for Hubble Legacy Archive Data (High level data products)

The HST archive at CADC contains the following data:

  • All public (non-proprietary) HST data (produced by the CADC and ECF cache system, see below)
    • The standard HST archive products from the active instruments (ACS, COS, STIS, WFC3) are kept current by the HST Cache system described at the bottom of this page.
    • Data from legacy instruments (FOC, FOS, HRS, WFPC, WFPC2) have gone through a final calibration run and are not expected to change. (spectroscopy will appear very soon)

What is the HST Cache?

The cache is an envelope around HST archive file production. It is a set of database tables and software agents that ensures that all publicly available HST science pipeline products are preprocessed and readily available from storage at all times. This includes mechanisms to discover newly observed datasets to insert, and automatic reprocessing of datasets that benefit from updates to reference files, available meta-data, and general processing software upgrades.
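The two mechanisms described above (discovering new datasets and reprocessing stale ones) can be illustrated with a minimal sketch. This is not the actual HST Cache code: the `CacheEntry` record, the `plan_work` function, the integer version numbers, and the dataset names are all hypothetical, chosen only to show how an agent might compare a cache table against the current state of the pipeline.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """One row of a hypothetical cache table (not the real schema)."""
    dataset: str
    reference_version: int  # reference-file version used for the stored products
    software_version: int   # pipeline version used for the stored products


def plan_work(observed, cache, current_reference, current_software):
    """Decide what a hypothetical cache agent should do in one pass.

    observed: iterable of dataset names known to the archive
    cache:    dict mapping dataset name -> CacheEntry
    Returns (to_insert, to_reprocess), both sorted lists of dataset names.
    """
    # Newly observed datasets: known to the archive but not yet in the cache.
    to_insert = sorted(name for name in observed if name not in cache)

    # Stale datasets: stored products predate the current reference files
    # or the current processing software.
    to_reprocess = sorted(
        entry.dataset
        for entry in cache.values()
        if entry.reference_version < current_reference
        or entry.software_version < current_software
    )
    return to_insert, to_reprocess


# Hypothetical example: dataset names and versions are made up.
cache = {
    "dataset_a": CacheEntry("dataset_a", reference_version=1, software_version=1),
    "dataset_b": CacheEntry("dataset_b", reference_version=2, software_version=1),
}
new, stale = plan_work(
    ["dataset_a", "dataset_b", "dataset_c"],
    cache,
    current_reference=2,
    current_software=1,
)
print("insert:", new)
print("reprocess:", stale)
```

In this sketch the user never triggers processing; the agent does the work ahead of time, so a request only has to read finished products from storage.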

Why do we need a cache?

Since 2002, all data from active instruments has been produced from scratch, triggered by user requests. The reasoning behind the On The Fly Reprocessing (OTFR) and On The Fly Calibration (OTFC) pipelines was that they would guarantee that the archive user would always get her data equipped with the newest set of meta-data and calibrated according to the best methods available. This was a clear advantage over the previous system, where the raw data was produced centrally at the STScI and delivered to the partner sites, essentially freezing that data in time. Another advantage of the system was that it conserved storage space, as only the Hubble Space Telescope telemetry files and a few smaller auxiliary files needed to be stored, an important resource consideration when data was stored on optical disks in jukeboxes.

With the advent of cheap mass storage in the form of hard-disk arrays, this aspect became less important, and a number of drawbacks of the on-the-fly paradigm became apparent over time. Live processing of data requires that support be available at all times to resolve errors and bugs in the pipeline, an inevitable task for a system this complex with such a heterogeneous set of input data. Another drawback is processing speed: producing a dataset could take from several minutes to hours, which might not be an issue for the patient astronomer, but makes it impossible to expose the data through synchronous VO protocols. Next-level efforts like data-mining, metadata harvesting, and production of high-level data products are also enormously difficult in the on-the-fly world.

The advantages of the HST Cache are:

  • Faster access speed
  • Shields users from processing errors
  • Direct programmatic & VO protocol access to the data
  • Makes the archive less prone to overall system breakdowns
  • Allows site interoperability and redundancy
  • Less maintenance in the long run
  • Allows harvesting of meta-data and data-mining
  • Data is often uniformly calibrated