HadCRUt3 vs. ERA Interim

Climate (or global atmospheric) reanalyses, a blend of model and observation, are an alternative way to assess how the global climate evolves over time. They tend to include a multitude of variables, but I would like to focus on the one specifically pertaining to our recent discussion about GISTEMP vs. HadCRUt3: global temperatures.

There’s a host of different climate reanalyses around; among the most reputable ones, though, are those conducted by the American agencies NCEP (NOAA) and NCAR, the Japanese JMA, and the European ECMWF.

So, what is a climate reanalysis?

ECMWF explains:

“A climate reanalysis gives a numerical description of the recent climate, produced by combining models with observations. It contains estimates of atmospheric parameters such as air temperature, pressure and wind at different altitudes, and surface parameters such as rainfall, soil moisture content, and sea-surface temperature. (…)

ECMWF periodically uses its forecast models and data assimilation systems to ‘reanalyse’ archived observations, creating global data sets describing the recent history of the atmosphere, land surface, and oceans. Reanalysis data are used for monitoring climate change, for research and education, and for commercial applications.

Current research in reanalysis at ECMWF focuses on the development of consistent reanalyses of the coupled climate system, including atmosphere, land surface, ocean, sea ice, and the carbon cycle, extending back as far as a century or more. The work involves collection, preparation and assessment of climate observations, ranging from early in-situ surface observations made by meteorological observers to modern high-resolution satellite data sets. Special developments in data assimilation are needed to ensure the best possible temporal consistency of the reanalyses, which can be adversely affected by biases in models and observations, and by the ever-changing observing system.”

From NCAR/UCAR’s Climate Data Guide:

“Reanalysis [is] a systematic approach to produce data sets for climate monitoring and research. Reanalyses are created via an unchanging (“frozen”) data assimilation scheme and model(s) which ingest all available observations every 6-12 hours over the period being analyzed. This unchanging framework provides a dynamically consistent estimate of the climate state at each time step. The one component of this framework which does vary are the sources of the raw input data. This is unavoidable due to the ever changing observational network which includes, but is not limited to, radiosonde, satellite, buoy, aircraft and ship reports. Currently, approximately 7-9 million observations are ingested at each time step. Over the duration of each reanalysis product, the changing observation mix can produce artificial variability and spurious trends. Still, the various reanalysis products have proven to be quite useful when used with appropriate care.”

So a climate reanalysis is basically a rather complex procedure of assimilating tons of data from a multitude of different observing platforms within a specified model framework. The aim is to turn a long string of more-or-less real-time (short-term) diagnostic states of a certain weather variable into a historically coherent (long-term) climate dataset of that same variable, essentially going from constantly recalibrated ‘initial values’ to more floating and inert ‘boundary values’. The nice thing about it is that it’s fundamentally observation-based: it only looks backwards in time, to where we have actual data, and it need not attempt to simulate future climate based on no data at all, like GCMs do.

This is generally how the method works:

“A numerical model determines how a model state at a particular time changes into the model state at a later time. Even if the numerical model were a perfect representation of an actual system (which of course can rarely if ever be the case) in order to make a perfect forecast of the future state of the actual system the initial state of the numerical model would also have to be a perfect representation of the actual state of the system.

Data assimilation or, more-or-less synonymously, data analysis is the process by which observations of the actual system are incorporated into the model state of a numerical model of that system. Applications of data assimilation arise in many fields of geosciences, perhaps most importantly in weather forecasting and hydrology.

A frequently encountered problem is that the number of observations of the actual system available for analysis is orders of magnitude smaller than the number of values required to specify the model state. The initial state of the numerical model cannot therefore be determined from the available observations alone. Instead, the numerical model is used to propagate information from past observations to the current time. This is then combined with current observations of the actual system using a data assimilation method.

Most commonly this leads to the numerical modelling system alternately performing a numerical forecast and a data analysis. This is known as analysis/forecast cycling. The forecast from the previous analysis to the current one is frequently called the background.

The analysis combines the information in the background with that of the current observations, essentially by taking a weighted mean of the two; using estimates of the uncertainty of each to determine their weighting factors. The data assimilation procedure is invariably multivariate and includes approximate relationships between the variables. The observations are of the actual system, rather than of the model’s incomplete representation of that system, and so may have different relationships between the variables from those in the model. (…)

As an alternative to analysis/forecast cycles, data assimilation can proceed by some sort of continuous process such as nudging, where the model equations themselves are modified to add terms that continuously push the model towards the observations.”
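To make the "weighted mean" step in the quote a bit more tangible, here is a minimal single-variable sketch of one analysis update and a short analysis/forecast cycle. It is only an illustration of the principle, not how ECMWF or anyone else actually implements it: real systems are multivariate and vastly larger. The function names, the toy forecast model, the error variances and the observation values are all made up for the example.

```python
def analysis_update(background, obs, var_background, var_obs):
    """Combine a background value and an observation by inverse-variance weighting.

    The lower the uncertainty (variance) of a source, the larger its weight.
    Returns the analysis value and its (reduced) error variance.
    """
    gain = var_background / (var_background + var_obs)  # weight given to the observation
    analysis = background + gain * (obs - background)
    var_analysis = (1.0 - gain) * var_background
    return analysis, var_analysis


def toy_forecast(state):
    """Hypothetical stand-in for a forecast model: relax the state halfway towards 15.0."""
    return state + 0.5 * (15.0 - state)


# Analysis/forecast cycling with invented numbers.
state, variance = 14.0, 1.0
for obs in [15.2, 14.9, 15.1]:
    background = toy_forecast(state)      # forecast from the previous analysis
    var_background = variance + 0.25      # assumed error growth during the forecast
    state, variance = analysis_update(background, obs, var_background, var_obs=0.5)
    print(round(state, 3), round(variance, 3))
```

With equal uncertainties the analysis lands exactly halfway between background and observation; the more trusted source always pulls the result towards itself.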

And from the above:

“A meteorological reanalysis is a meteorological data assimilation project which aims to assimilate historical observational data spanning an extended period, using a single consistent assimilation (or “analysis”) scheme throughout.


In operational numerical weather prediction, forecast models are used to predict future states of the atmosphere, based on how the climate system evolves with time from an initial state.


In addition to initializing operational forecasts, the analyses themselves are a valuable tool for subsequent meteorological and climatological studies. However, an operational analysis dataset, i.e. the analysis data which were used for the real-time forecasts, will typically suffer from inconsistency if it spans any extended period of time, because operational analysis systems are frequently being improved. A reanalysis project involves reprocessing observational data spanning an extended historical period using a consistent modern analysis system, to produce a dataset that can be used for meteorological and climatological studies.”

(My emphasis.)

The “ERA Interim” temperature product

So, I wanted to see how one such climate reanalysis product stacked up against the more traditional one-source observational metrics, like – in the case of temperature anomalies – the instrumental and/or satellite records.

For this I chose ECMWF’s “ERA Interim”, which replaces (and improves upon) their older, but highly successful ERA-40 product. ERA Interim, which goes back to 1979, is by many considered preeminent among the current (third-generation) climate reanalyses. It’s by no means perfect, but its general quality is deemed solid by most, and it is already heavily used in climate research studies around the world.

To me, the most intriguing question was definitely this one:

How does ERA Interim position itself regarding the matter of global temperature anomaly evolution and the fairly dramatic divergence between the GISTEMP LOTI and HadCRUt3 products described and discussed in my previous post?

Luckily, the ERA Interim temperature data is available at KNMI Climate Explorer. So finding the answer proved to be a relatively easy task.

The ERA Interim temperature product includes data from levels at various intervals from the actual solid/liquid surface to the 200mb level around the global mean tropopause (~12 km above the surface).

Since the temperature data of the global surface records is expressed as a weighted mean of land (29.2%) and ocean (70.8%), a jolly mix of direct surface measurements (SSTs) and measurements taken ~2m above the ground (land stations), it felt natural to compare them to Interim’s T_sfc product as well as to its T_2m product. In any event, the differences between the two proved to be minor.
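The land/ocean weighting mentioned above is simple arithmetic, sketched here. The two fractions are the ones quoted in the text; the function name and the example anomalies are invented, and the real datasets of course work from gridded data rather than two single numbers.

```python
LAND_FRACTION = 0.292   # share of the global surface that is land (from the text)
OCEAN_FRACTION = 0.708  # share that is ocean (from the text)


def global_mean_anomaly(land_anomaly, ocean_anomaly):
    """Area-weighted mean of a land anomaly and an ocean (SST) anomaly, in K."""
    return LAND_FRACTION * land_anomaly + OCEAN_FRACTION * ocean_anomaly


# Invented example values: land +0.8 K, ocean +0.4 K.
print(round(global_mean_anomaly(0.8, 0.4), 4))  # 0.5168
```

Because the ocean makes up more than two thirds of the surface, the global mean sits much closer to the ocean value than to the land value.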


First, here’s HadCRUt3 gl (with the now – on this blog – standard 0.064K downward adjustment applied from Jan’98 onwards) versus ERA Interim T_2m gl, from 1979 to 2015:

Figure 1. HadCRUt3 gl vs. ERAI T_2m gl, 1979-2015.

Focusing on the 1995-2015 period:

Figure 2. HadCRUt3 gl vs. ERAI T_2m gl, 1995-2015.

Not a perfect fit, but all in all still very good indeed. I would actually say surprisingly good! You will note how the black ERAI T_2m curve does not rise even a tad higher than the lime green HadCRUt3 curve going from 1979 to 2014, and surely not from 1995 to 2014. It is of course thoroughly satisfying to see the validity of my Jan’98> H3 adjustment once again forcefully supported 🙂
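For anyone wanting to reproduce the Jan’98 adjustment on their own copy of the data, it is nothing more than a constant step offset. Here is a minimal sketch, assuming the monthly series is held as a list of ((year, month), anomaly) pairs; that layout, the function name and the sample values are my own choices for illustration, not the format the data actually comes in.

```python
def apply_step_adjustment(series, start=(1998, 1), offset=-0.064):
    """Add `offset` (in K) to every monthly value from `start` onwards.

    `series` is a list of ((year, month), anomaly) tuples; earlier months are left untouched.
    """
    adjusted = []
    for (year, month), value in series:
        if (year, month) >= start:   # tuple comparison: year first, then month
            value = value + offset
        adjusted.append(((year, month), value))
    return adjusted


# Invented sample values, just to show the effect.
sample = [((1997, 12), 0.50), ((1998, 1), 0.50), ((2000, 6), 0.30)]
for date, value in apply_step_adjustment(sample):
    print(date, round(value, 3))
```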

Over to HadCRUt3 gl vs. ERA Interim T_sfc gl:

Figure 3. HadCRUt3 gl vs. ERAI T_sfc gl, 1979-2015.

And the same, only between 1995 and 2015:

Figure 4. HadCRUt3 gl vs. ERAI T_sfc gl, 1995-2015.

The same thing, really, with T_sfc as with T_2m. The Jan’98> adjustment vindicated (almost even more so here), and no difference in the overall rise in global temperature anomalies from 1979-80 to 2013-14, certainly not from 1995 to 2014.

But there is a particular pattern to be distinguished in both ERAI temp products when compared directly with H3 in this way, and that is a tendency towards a less pronounced “Pause”. The peak around El Niño 1997/98 seems distinctly to have slumped down, while the 2005-2012 segment appears just as clearly to have been lifted up.

This pattern actually very much resembles the way HadCRUt4, UEA and UKMO’s current (‘upgraded’) version of global surface temperatures, differs from HadCRUt3, except that in ERAI the artificial lift across the 1997-98 transition has been appropriately sorted out.

As can be seen here:

Figure 5. HadCRUt4 vs. ERAI T_2m and T_sfc.

The spurious 1997-98 step warming, which originally showed up in the H3 dataset and was carried over without correction (or even notice) to the newest version (H4), is clearly seen here. However, watch what happens when we lift the two ERAI curves to make up for it:

Figure 6. HadCRUt4 vs. ERAI T_2m and T_sfc, with the two ERAI curves lifted.

The fit from 1997 is now what I would call striking! Except during the last couple of years, when H4 veers off once again …

OK. Now, what about GISTEMP?

Here it is:

Figure 7. GISTEMP LOTI vs. ERAI T_2m and T_sfc.

Bear in mind, the yellow GISTEMP curve is specifically aligned with the two ERAI curves (black and blue) at the start of the period (1995-97). From then on, though, you can quite easily observe how it gradually drifts away in an upward direction, the only interruption to this tendency being the 2005-2007 segment. In 2014 and 2015, the GISTEMP LOTI global mean curve appears to be averaging somewhere between 0.1 and 0.15 degrees higher than the ERAI curves.
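Aligning one anomaly curve with another over a start window, as done above for 1995-97, just means shifting it by a constant so the two have the same mean over that window. A minimal sketch follows, assuming annual values stored in plain dicts keyed by year; the function name and the toy numbers are invented and are not the actual GISTEMP or ERAI values.

```python
def align_to_reference(series, reference, window=(1995, 1997)):
    """Shift `series` by a constant so its mean over `window` matches `reference`.

    Both arguments are dicts mapping year -> anomaly; `window` is inclusive.
    """
    years = [y for y in range(window[0], window[1] + 1)
             if y in series and y in reference]
    if not years:
        raise ValueError("no overlapping years inside the alignment window")
    offset = sum(reference[y] - series[y] for y in years) / len(years)
    return {year: value + offset for year, value in series.items()}


# Toy numbers only: curve `a` sits 0.2 K above `b` during the window.
a = {1995: 0.4, 1996: 0.3, 1997: 0.5, 2014: 0.9}
b = {1995: 0.2, 1996: 0.1, 1997: 0.3, 2014: 0.5}
aligned = align_to_reference(a, b)
print(round(aligned[2014] - b[2014], 3))  # 0.2: the drift left after alignment
```

Whatever gap remains at the end of the period after such a shift is then a real difference in how the two curves evolve, not a baseline artefact.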

The GISTEMP LOTI curve, when it includes data from the narrowed-down (near-global) 60N-60S band only, happens to line up very well with the HadCRUt3 full-global (90N-90S) curve, as readily seen in the GISS-HadCRU post. So I thought it would be interesting to try the same thing with the ERA Interim temperature product (limited to the T_2m level), just for the fun of it, to see if even here the inclusion of the polar caps somehow affects the global mean in a significant way.

Look at this:

Figure 8. HadCRUt3 90-90 vs. ERAI T_2m 60-60.

Yup. We see the 1997-99 segment (on both sides of El Niño 1997/98) substantially raised, and the 2005-2012 segment similarly lowered, to make for a much better match. The fit is closing in on perfect, the ERAI “Pause” now much more evident.

Still, the overall difference between ERAI T_2m 90-90 and 60-60 is nowhere near the GISTEMP LOTI one:

Figure 9. ERAI T_2m 90-90 vs. ERAI T_2m 60-60.

As you can see, the black 90-90 curve has not lifted away from the red 60-60 curve going from 1995 to 2015. If anything, the opposite is happening towards the end. The differences all stem from intermediate tweaks up and down … (The Arctic influence on the global mean seems somewhat inflated also in the ERA Interim data, but not at all to the same absurd degree as in GISTEMP LOTI.)
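For a sense of how much weight the polar caps can carry at all, here is a small sketch of area weighting by latitude. It assumes the data has already been reduced to zonal means on evenly spaced latitude centres, weighted by cos(latitude); the function names and the toy values are mine, and this is not necessarily how KNMI Climate Explorer computes its band averages.

```python
import math


def band_area_fraction(lat_south, lat_north):
    """Fraction of the sphere's surface lying between two latitudes (in degrees)."""
    return (math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))) / 2.0


def band_mean(zonal_means, lat_south=-90.0, lat_north=90.0):
    """cos(latitude)-weighted mean of zonal means given as {latitude_centre: value}."""
    selected = {lat: v for lat, v in zonal_means.items() if lat_south <= lat <= lat_north}
    if not selected:
        raise ValueError("no latitude centres inside the requested band")
    weights = {lat: math.cos(math.radians(lat)) for lat in selected}
    return sum(weights[lat] * selected[lat] for lat in selected) / sum(weights.values())


print(round(band_area_fraction(-60, 60), 3))  # 0.866: the 60N-60S band is ~87% of the globe

# Toy case: a 1 K anomaly confined to the northernmost of six 30-degree bands.
zonal = {-75: 0.0, -45: 0.0, -15: 0.0, 15: 0.0, 45: 0.0, 75: 1.0}
print(round(band_mean(zonal), 3), band_mean(zonal, -60, 60))
```

The two polar caps together cover only about 13% of the surface, so they have to deviate a lot from the rest of the globe to shift the global mean noticeably.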

IOW, the ERA Interim temperature product all in all appears to position itself firmly on the side of HadCRUt3, not GISTEMP LOTI, on the global temp anomaly issue.

Finally, for the sake of a certain measure of completeness, we’ll compare a couple of the ERA Interim temperature product’s tropospheric levels with the satellite records (represented here by UAH tlt v6 (beta4)). The 700mb level (Fig.10) is situated on average around 3km up in the air, while the 500mb level (Fig.11) is normally to be found about 5.5km above the ground. The weighting function of the “Lower Troposphere (LT)” level for the new UAH version dataset is centred on a column altitude of approximately 3.5-4 kilometres. That would be equivalent to a mean air pressure level of perhaps 630mb.
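The altitude figures above can be roughly cross-checked with the standard-atmosphere barometric formula. This is a sketch under the assumption of the International Standard Atmosphere (sea-level pressure 1013.25 hPa, sea-level temperature 288.15 K, lapse rate 6.5 K/km), which is only valid within the troposphere and is an idealisation, not the real atmosphere on any given day. The function name is made up.

```python
def pressure_altitude_m(pressure_hpa, p0_hpa=1013.25):
    """Approximate altitude (m) of a pressure level in the standard atmosphere.

    Uses the tropospheric barometric formula; 1 mb equals 1 hPa.
    """
    t0 = 288.15          # sea-level temperature, K (ISA assumption)
    lapse = 0.0065       # temperature lapse rate, K/m (ISA assumption)
    exponent = 0.190263  # R*lapse/(g*M) for dry air
    return (t0 / lapse) * (1.0 - (pressure_hpa / p0_hpa) ** exponent)


for level in (700, 630, 500):
    print(level, "mb ->", round(pressure_altitude_m(level) / 1000.0, 1), "km")
```

This gives roughly 3.0 km for 700mb, 3.8 km for 630mb and 5.6 km for 500mb, in line with the numbers quoted above.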

First, here’s the 700mb level:

Figure 10. UAH tlt v6 vs. ERAI T_700mb.

And the 500mb level:

Figure 11. UAH tlt v6 vs. ERAI T_500mb.

It seems obvious to me from these plots that ERA Interim’s global tropospheric temperature profile is highly influenced by data inputs mainly from the global – yet coverage-wise rather spotty – radiosonde network. And of course there’s nothing inherently wrong with this. But it does seem to produce some rather strange irregularities (conspicuous ‘steps’ up and down; uncorrected inhomogeneities?) relative to the satellites. I at least feel rather confident that the satellites are currently on safer ground than the radiosondes when it comes to rendering the actual evolution of global temp anomalies since the late 70s, because of how tightly they track the surface records (HadCRUt3, that is), and also because of how well they manage to match up with the CERES EBAF ToA fluxes (both ASR (Qin) and OLR (Qout)) since 2000 (previous post).

4 comments on “HadCRUt3 vs. ERA Interim”

  1. oz4caster says:

    Very interesting. I have been looking at the GFS-based CFSR provided by the University of Maine Climate Change Institute. I noticed the same discrepancy with the NCEI estimates substantially higher than the CFSR estimates for the last couple of years.

    I pulled the ERAI data from UM CCI but it only goes through 2014. The annual ERAI estimates were lower than the CFSR for 2002-2007 but higher for 2011-2014.
