**Phil Jones Statement (February 2010)**

Phil Jones of the Climate Research Unit (CRU) responded to a series of questions from the BBC in early February 2010 (link). Question B dealt with global warming trends in the 1995 – 2009 period. Here’s the BBC question and Phil Jones’ answer:

**BBC Question B:** *“Do you agree that from 1995 to the present there has been no statistically-significant global warming?”*

**Phil Jones’ Answer:** *“Yes, but only just. I also calculated the trend for the period 1995 to 2009. This trend (0.12C per decade) is positive, but not significant at the 95% significance level. The positive trend is quite close to the significance level. Achieving statistical significance in scientific terms is much more likely for longer periods, and much less likely for shorter periods.”*

Phil Jones’ statement provided a time series regression learning moment for many of us citizen climate observers who quickly checked his statement with our Excel, R or other handy regression analysis tools. I sure did. Two readers, J and S, contacted me with questions and comments:

**J:** *“Hi Kelly, I’ve got some really basic statistics questions – do you mind if I ask you? Forgive me if they seem rudimentary – it’s been 20 years since I did statistics at …*

*I’m looking at the recent Phil Jones statement of no statistically significant warming since 1995. So I was trying to reproduce his results and used the LINEST function in Excel on the HadCRUT temperature data. The LINEST function gave me a slope of 0.11 with an uncertainty of 0.058. Is the uncertainty value from LINEST equal to one standard deviation? If so, would that mean the 95% confidence interval = ± 1.96 * standard deviation = ± 0.114?*

*So the 95% confidence interval of the slope is 0.11 ± 0.114, hence it’s not a statistically significant trend.*

*Is this all correct? Just want to make sure I’ve got my figures correct. Thanks, J”*
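J’s arithmetic can be checked directly in R. One caveat worth adding: the “uncertainty” LINEST reports is the standard error of the slope, and the exact 95% interval uses a t quantile rather than 1.96 (nearly identical for a series this long). Here is a sketch on simulated monthly anomalies; the seed, trend (0.01 C/yr), and noise level below are made-up stand-ins, not the HadCRUT values:

```r
# Check J's confidence-interval arithmetic on simulated monthly "anomalies".
set.seed(42)
t <- seq(1995, 2009.917, by = 1/12)                  # 180 monthly time steps
y <- 0.01 * (t - 1995) + rnorm(length(t), sd = 0.1)  # hypothetical anomalies
fit <- lm(y ~ t)

se    <- summary(fit)$coefficients["t", "Std. Error"]  # LINEST's "uncertainty"
slope <- coef(fit)["t"]

# The exact 95% CI uses the t distribution; with 178 df, qt() is about 1.97,
# so J's 1.96 shortcut is essentially right for a series this long.
ci <- slope + c(-1, 1) * qt(0.975, df.residual(fit)) * se
all.equal(unname(ci), unname(confint(fit)["t", ]))   # TRUE
```

Note that this interval still inherits OLS’s independence assumption, which, as we’ll see below, fails for these data.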

**S:** *“Kelly, thanks for your site. I have been learning R using your site (and other sources).*

*I am somewhat confused about the statistical significance issue. It is claimed that the Hadley (and RSS and UAH) temperature data are not (quite) statistically significant from 1995.*

*My understanding is that statistical significance largely involves the p value. My R script of Hadley data from 1995 gives a p value of 5.538e-07. This is statistically significant at the 99% level! Am I misunderstanding statistical significance? Or is there something wrong with my script … Cheers, S”*
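Part of S’s puzzle is that with positively autocorrelated residuals, OLS treats 180 monthly values as 180 independent observations, so the reported p-value is far too optimistic. One way to see this is to average the monthly values to annual means, which discards little real information but weakens the apparent significance dramatically. A simulation sketch, where the AR(1) coefficient 0.6, trend 0.01 C/yr and noise sd 0.08 are all illustrative assumptions, not the actual Hadley series:

```r
# Monthly vs annual p-values for the same simulated trend with AR(1) noise.
set.seed(1)
n <- 180
e <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.08))
t <- seq(1995, length.out = n, by = 1/12)
y <- 0.01 * (t - 1995) + e

p_monthly <- summary(lm(y ~ t))$coefficients["t", "Pr(>|t|)"]

# Collapse to 15 annual means and refit.
yr  <- floor(t)
ann <- tapply(y, yr, mean)
t_a <- as.numeric(names(ann))
y_a <- as.numeric(ann)
p_annual <- summary(lm(y_a ~ t_a))$coefficients["t_a", "Pr(>|t|)"]

c(monthly = p_monthly, annual = p_annual)  # the annual p-value is typically much larger
```

The monthly OLS p-value looks stronger than the data warrant, which is exactly the issue this tutorial works through.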

I’ve written this tutorial to make sure that I fully understand how to properly conduct time series regression of temperature anomaly data. I also hope that it helps J, S and any other interested readers who need a little information and/or refresher on time series regression.

**Working With Hadley Anomaly Time Series Data**

Let’s retrieve and look at the Hadley temperature anomaly data to see if we can reproduce Phil Jones’ results. I regularly update a consolidated monthly temperature anomaly data file for the 5 major series on my ProcessTrends website. You can download the data file and follow along in Excel or R if you’d like.

Here’s my R script that reads in the monthly temperature anomaly data.

# Retrieve temperature anomaly data and print head() to console
options(digits = 6)
link <- "http://processtrends.com/Files/RClimate_consol_temp_anom_latest.csv"
df <- read.csv(link)
head(df)

Here are the first 6 lines of the df data.frame. Notice that the 5 temperature anomaly data series start in 1880; the satellite-based RSS and UAH series are NA until 1979.

Let’s subset the data to the 1995-2009 period for just the HAD data series.

## subset HAD series for 1995 - 2009 period
df <- subset(df, df$yr_frac > 1995 & df$yr_frac < 2010)
HAD_df <- df[, c(1, 4)]
head(HAD_df)

Here are the first 6 lines of the HAD_df data.frame.

Now we are ready to generate an ordinary least squares (OLS) linear model of the HAD anomaly data and look at the regression summary, summary(ols_mod).

## Create OLS Linear model
ols_mod <- lm(HAD ~ yr_frac, data = HAD_df)
summary(ols_mod)

Here’s our summary(ols_mod).

A few key points on summary(ols_mod):

- HAD = -21.01643 + 0.01069 * yr_frac
- Adjusted R-squared is reported to be 0.116
- The intercept and slope t values indicate that both are statistically significant at the 0.01 level
- The F-statistic indicates that the results are statistically significant at the 0.01 level

Let’s plot the HAD anomalies and the OLS trend line to see how they look.

Title <- "Hadley Temperature Anomaly Trend Chart and\n OLS Linear Regression: 1995-2009"
plot(HAD ~ yr_frac, data = HAD_df, type = "l", col = "grey",
     main = Title, cex.main = 0.85, ylab = "HAD Anomaly - C", xlab = "", las = 1)
abline(ols_mod, col = "red")

Here’s our 1995 – 2009 trend chart with OLS regression line.

Everything looks fine, right? The OLS model says that the 1995-2009 Hadley temperature anomaly trend of 0.01069 C per year (about 0.11 C per decade) is statistically significant. So why does Phil Jones say *“… trend (0.12C per decade) is positive, but not significant at the 95% significance level”*? Readers please note that I am not sure exactly which data set Phil Jones used; I have used the HADCRUT3 nh+sh data set. Our results are close to his, but not exactly the same.

Before we do a diagnosis of the OLS linear regression model, let’s briefly review what is well known about using OLS for time series data.

**Brief Review of OLS and Time Series Data Literature**

George Udny Yule introduced the concept of nonsense correlation in his 1926 paper, *“Why Do We Sometimes Get Nonsense Correlations between Time-Series?”*, published in the *Journal of the Royal Statistical Society* 89, 1–64. PDF available at this link.

Granger & Newbold, in their 1974 article *“Spurious Regressions in Econometrics”* (link), said:

*“It is very common to see reported in applied econometric literature time series regression equations with an apparently high degree of fit, as measured by the coefficient of multiple correlation R2 or the corrected coefficient R2, but with an extremely low value for the Durbin-Watson statistic. We find it very curious that whereas virtually every textbook on econometric methodology contains explicit warnings of the dangers of autocorrelated errors …”*


Granger and Newbold go on to list three major consequences of autocorrelated errors in regression analysis:

*(i) Estimates of the regression coefficients are inefficient.*

*(ii) Forecasts based on the regression equations are sub-optimal.*

*(iii) The usual significance tests on the coefficients are invalid.*
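The Granger & Newbold point can be demonstrated in a few lines: regress one random walk on a second, completely independent random walk, and OLS will declare the slope “significant” far more often than the nominal 5% of the time. This is a simulation sketch, not their exact experiment:

```r
# Granger & Newbold's warning in miniature: count how often OLS calls the
# slope between two independent random walks "significant" at the 5% level.
set.seed(3)
pvals <- replicate(200, {
  x <- cumsum(rnorm(100))   # random walk 1
  y <- cumsum(rnorm(100))   # random walk 2, independent of x
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
})
mean(pvals < 0.05)   # far above the nominal 5% false-positive rate
```

The high false-positive rate is exactly the “spurious regression” problem: trending, autocorrelated series fool the usual OLS significance tests.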

Is OLS suitable for time series data? Here’s what SAS (a major statistical software vendor) has to say:

*“Ordinary regression analysis is based on several statistical assumptions. One key assumption is that the errors are independent of each other. However, with time series data, the ordinary regression residuals usually are correlated over time. It is not desirable to use ordinary regression analysis for time series data since the assumptions on which the classical linear regression model is based will usually be violated.”* Source: **SAS** – link

This is not new information; statisticians and econometricians have known and written about it for years. While it may be new to some citizen climate observers, we need to learn (or remember) that R and/or Excel will give OLS results for time series analysis that look better than they actually are, because the underlying OLS assumptions are violated.

**Diagnosis of OLS_Mod Results**

Now that we have been warned about autocorrelation, let’s see how our HAD ols_mod holds up to regression diagnostics.

As a first diagnostic step, let’s look at the ols_mod residuals.

plot(ols_mod$residuals, pch = 16, col = "grey", cex = 1, las = 1,
     main = "Hadley Anomaly: 1995-2009 OLS Model Residuals",
     ylab = "Residual")
abline(h = 0)

Here’s our residuals plot.

These residuals do not look independent of each other; we see a number of runs of consecutive residuals on the same side of the 0 line. This tells me the temperature anomalies tend to persist: the t+1 anomaly tends to stay on the same side of the 0 line as the t anomaly. That raises a red flag for me.
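The eyeball runs test can be backed up with a number: the lag-1 correlation of the residuals, which should be near zero for independent errors. A sketch on simulated AR(1) residuals standing in for ols_mod$residuals (the AR coefficient 0.6 is an assumed value, not estimated from the Hadley data):

```r
# Put a number on the runs we see by eye: lag-1 correlation of the residuals.
set.seed(7)
res <- as.numeric(arima.sim(list(ar = 0.6), n = 180))  # stand-in residuals
lag1 <- cor(res[-1], res[-length(res)])
round(lag1, 2)   # well above 0 for persistent residuals
```

On the actual model, the same check is cor(head(ols_mod$residuals, -1), tail(ols_mod$residuals, -1)).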

Let’s use R’s acf() function to look at the autocorrelation function of the HAD anomaly data series.

acf(HAD_df$HAD, las=1)

Here’s our acf() report.

The acf() plot shows the correlation coefficient (r) between values at time t and time t-lag. The dashed blue horizontal lines correspond to the 95% confidence limits. This plot shows that the Hadley anomaly observations are autocorrelated for lags of 1-10 and 20-23.
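For reference, those dashed limits come from a simple formula: for a white-noise series of length n, the approximate 95% bounds on each autocorrelation estimate are plus or minus 1.96/sqrt(n). For our 180 monthly values:

```r
# The dashed blue lines in the acf() plot are the approximate 95% limits
# for white noise: +/- 1.96 / sqrt(n).
n <- 180                 # months in the 1995-2009 series
limit <- 1.96 / sqrt(n)
round(limit, 3)          # 0.146
```

Any bar that pokes well outside those limits, as many do here, is unlikely to arise from an independent series.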

We can also run a Durbin Watson test which gives us these results:

# Install the car library if you don't already have it installed
library(car)
durbin.watson(ols_mod, max.lag = 12)

Here’s our durbin.watson() report.

Clearly our ols_mod suffers from autocorrelated errors, meaning that the *“usual significance tests on the coefficients are invalid”*, as Granger & Newbold said in 1974.
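It helps to know what durbin.watson() is computing. The DW statistic is the sum of squared successive residual differences divided by the sum of squared residuals, which is roughly 2 * (1 - r1), where r1 is the lag-1 autocorrelation; values well below 2 signal positive autocorrelation. A hand-rolled version on simulated residuals (the AR coefficient 0.6 is an assumption, standing in for ols_mod$residuals):

```r
# Durbin-Watson by hand: sum of squared successive differences over the
# sum of squared residuals, roughly 2 * (1 - lag-1 autocorrelation).
dw <- function(res) sum(diff(res)^2) / sum(res^2)

set.seed(9)
res_ar  <- as.numeric(arima.sim(list(ar = 0.6), n = 180))  # autocorrelated
res_iid <- rnorm(180)                                      # independent
c(autocorrelated = dw(res_ar), independent = dw(res_iid))
# the autocorrelated series gives a DW well below 2; the independent one is near 2
```

Running dw(ols_mod$residuals) should closely match the lag-1 statistic that car::durbin.watson() reports.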

There is an R package, gvlma, that tests the underlying assumptions of an OLS linear model. Let’s use this package to assess how well our ols_mod meets those assumptions.

# You'll need to install the gvlma package if you don't have it installed
library(gvlma)
gvmodel <- gvlma(ols_mod)
summary(gvmodel)

Here’s our gvlma() report.

The gvlma() results are very clear: the ols_mod regression does not meet the underlying OLS assumptions. Even though our OLS regression looks good, we have to conclude that the Hadley anomaly series for 1995-2009 is serially correlated, meaning that we cannot rely on the OLS regression significance tests.

My take-home message from this example: **don’t use OLS for climate time series data**!

In my next post in this series, I’ll show how to use generalized least squares (gls) regression for this autocorrelated time series data.
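As a taste of that approach, here is a minimal gls() sketch using nlme’s AR(1) correlation structure, run on simulated data shaped like the 1995-2009 monthly series. The trend, AR coefficient, and noise level are illustrative assumptions, not the Hadley values:

```r
# Preview of the GLS fix: nlme::gls with an AR(1) error model.
library(nlme)   # ships with R

set.seed(11)
t <- seq(1995, length.out = 180, by = 1/12)
y <- 0.01 * (t - 1995) +
     as.numeric(arima.sim(list(ar = 0.6), n = 180, sd = 0.08))

gls_mod <- gls(y ~ t, correlation = corAR1(form = ~ 1))
summary(gls_mod)$tTable          # slope standard error is wider than lm()'s
summary(lm(y ~ t))$coefficients  # naive OLS, for comparison
```

The AR(1) correlation structure widens the slope’s standard error relative to plain lm(), which is exactly the honesty we need for serially correlated anomalies.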

**Comments**

As you mention, be careful about auto-correlation. Also, OLS is usually a good model.

Here is a much simpler Weather Anomaly Detection

Should the call to acf() use ols_mod$residuals as its argument instead of HAD_df$HAD?

Otherwise the long lag autocorrelations go through the roof when you use a period long enough to show a significant trend. If you do acf on the residuals, then the results are somewhat similar for any reasonably linear period.

Thanks for the great analysis of OLS with time-series data; I look forward to the next installment.

I would like to suggest that the second plot should be replaced with a process behavior chart of the residuals (individuals and moving range), which would provide tests for trends and non-homogeneity in the data. You are already using the residuals chart to eyeball the randomness; the ImR chart would simply provide a clearer gauge of (non)randomness. Looking at the data, it’s clear that the residuals would at least fail Western Electric rule 4 and probably rule 3 (nine consecutive points on the same side of the mean, and probably four out of five points > 1 sigma from the mean, respectively).

Following along using the R code provided, my plot of the trend line did not look the same as the one provided in the post. Then I noticed that, despite the figure heading indicating the range is 1995-2009, the posted graph has the full range of data from 1880.

I assume you have linked to the wrong .png file?

Terry

Thank you for catching that. I’ve corrected the link; it now shows the 1995-2009 data, not the entire data series.

Kelly,

Thanks for your clear analysis on this topic, which has perplexed me for some time.

I am looking forward to your next post.

Remember the autocorrelation is in the errors, not the original data points.

You see the problem with the model in the first plot – it’s clear that a linear trend is *not* a good fit to this data. Why worry about the significance of a clearly incorrect model?