Thales has supplied the French space agency CNES with a Big data processing infrastructure, encompassing hardware and software, for the GAIA mission that is setting out to map a billion stars.
Launched in December 2013 by the European Space Agency (ESA), the ambitious GAIA galactic surveying satellite is set to revolutionise astrometry as we know it.
During its planned five-year mission, GAIA will characterise one billion stars with unprecedented positional precision: to within 20 micro-arcseconds for stars of magnitude 15, equivalent to the thickness of a human hair seen from a distance of 1,000 km. In orbit around the L2 Lagrange point, the GAIA satellite will determine the fundamental astrometric and astrophysical parameters of the stars it observes using its spectro-photometer, spectrometer and astrometry instrument. Each star will be observed approximately 80 times during the course of the mission.
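The human-hair comparison can be checked with small-angle arithmetic. The sketch below assumes a hair thickness of about 0.1 mm, a typical value that the article does not state, and the function name is purely illustrative.

```python
import math

# 1 arcsecond = 1/3600 degree; 1 micro-arcsecond = 1e-6 arcsecond.
MICROARCSEC_PER_RADIAN = math.degrees(1.0) * 3600 * 1e6

def angular_size_microarcsec(width_m: float, distance_m: float) -> float:
    """Small-angle approximation: angle in radians ~= width / distance."""
    return (width_m / distance_m) * MICROARCSEC_PER_RADIAN

# Assumption: a human hair is roughly 0.1 mm thick (not given in the article).
hair_thickness_m = 1e-4
distance_m = 1_000e3  # 1,000 km expressed in metres

angle = angular_size_microarcsec(hair_thickness_m, distance_m)
print(f"{angle:.1f} micro-arcseconds")
```

Under that assumption the result comes out at roughly 20 micro-arcseconds, in line with the precision quoted for magnitude-15 stars.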
GAIA will thus enable scientists to establish a three-dimensional map of our galaxy and gain further insight into the structure and evolution of the Milky Way. It will also make significant contributions to advancing our understanding of exoplanets, the solar system, outer galaxies and fundamental physics.
The sheer number of stellar observations from GAIA will generate some 50 Gigabytes of data every day, amounting to more than 100 Terabytes of raw data by the end of the mission. After processing, the final data catalogue will be about one Petabyte (10^15 bytes). Such huge volumes of data call for a solid data exploitation strategy.
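As a rough order-of-magnitude check on these figures, the sketch below multiplies the daily volume by the mission length. It assumes decimal (SI) prefixes and treats "some 50 Gigabytes" and "five years" as round numbers, so the result only needs to agree in scale with the figure of around 100 Terabytes quoted above.

```python
# Decimal (SI) prefixes, an assumption: the article does not say which it uses.
GB = 10**9
TB = 10**12
PB = 10**15

daily_volume_bytes = 50 * GB   # "some 50 Gigabytes" per day
mission_days = 5 * 365.25      # planned five-year mission

raw_total_bytes = daily_volume_bytes * mission_days
catalogue_ratio = PB / (100 * TB)

print(f"raw data over the mission: about {raw_total_bytes / TB:.0f} TB")
print(f"one Petabyte is {catalogue_ratio:.0f} times 100 Terabytes")
```

With these rounded inputs the raw total lands at about 90 Terabytes, the same order of magnitude as the quoted figure, while the processed catalogue is about ten times larger.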
Early in the project, specifications grew rapidly as research requirements accumulated, multiplying data volumes 30-fold. It soon became clear that conventional data storage technologies would not be up to the task and that a different technology, better suited to GAIA’s needs, was required.
Using a Big data architecture, Thales has supplied CNES with a complete processing system, encompassing hardware and software, geared to the needs of the mission:
- An estimated one to three Petabytes of data to be stored and processed
- Scientific processing of tables containing up to 280 billion objects
- Complex handling of huge volumes of data: joining, filtering and transformation, etc.
- Complex sequencing and parallel processing, calling for tight orchestration and scheduling
The chief innovation lay in applying tools from the Web world (log analysis, text databases) in a scientific context.
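The kinds of operation listed above (joining, filtering, transformation and parallel processing) can be illustrated at toy scale. The sketch below is not the system supplied to CNES: the tables, field names and functions are hypothetical, and it uses only the Python standard library on a handful of rows rather than billions of objects.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical toy tables; field names are illustrative, not GAIA's schema.
stars = [
    {"source_id": 1, "magnitude": 14.2},
    {"source_id": 2, "magnitude": 19.8},
    {"source_id": 3, "magnitude": 15.0},
]
observations = [
    {"source_id": 1, "flux": 10.0},
    {"source_id": 1, "flux": 12.0},
    {"source_id": 2, "flux": 1.0},
    {"source_id": 3, "flux": 7.0},
    {"source_id": 3, "flux": 9.0},
]

def join_by_source(stars, observations):
    """Join: attach each star's observations via a hash index on source_id."""
    index = defaultdict(list)
    for obs in observations:
        index[obs["source_id"]].append(obs["flux"])
    return [{**star, "fluxes": index[star["source_id"]]} for star in stars]

def summarise(row):
    """Transformation: reduce a star's observations to a mean flux."""
    return {"source_id": row["source_id"], "mean_flux": mean(row["fluxes"])}

joined = join_by_source(stars, observations)

# Filtering: keep stars of magnitude 15 or brighter (smaller means brighter).
bright = [row for row in joined if row["magnitude"] <= 15.0]

# Parallel processing: rows are independent, so they can be mapped concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(summarise, bright))

print(results)
```

At mission scale the same pattern would be spread across many machines, which is where the tight orchestration and scheduling mentioned above come in.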