Showing Change Points in a Trend Chart with R

In this post, I show how to add change points to a trend chart with R. Readers can compare my R and Excel VBA solutions for the same chart to see how charting differs between the two tools.

Introduction

My goal is to be ambidextrous in Excel and R so that I can use the better tool for the charting job at hand rather than force a workaround. On the Excel Chart Doctor page of my ProcessTrends.com site, I have a number of advanced Excel charts, including this one showing El Nino and La Nina events on a long-term temperature anomaly chart.

[Figure: Excel version of the temperature anomaly chart with El Nino and La Nina events]

I used a VBA procedure to add the rectangles for the change points. The Excel Chart Doctor video explains the change point concept and shows my Excel-VBA procedure in action.

In this post, I make a similar chart with R.

Change Points on a Trend Chart

El Nino events, wars and recessions are examples of categorical conditions that may influence time series data. We can show these conditions on our chart by inserting a rectangle that covers the chart area from the beginning to the end of each period.

If we call the start date of each new period the change point, then we can build a series of rectangles just by knowing the change point dates. The sketch shows several change points for our El Nino example. We treat the start date of the El Nino as a change point and the end of the event as the start of another event period, neutral or La Nina. This way each event period is the time between two consecutive change point dates. By recording the start date and code for each event, we can track and plot changing periods on our time series chart.
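As a rough sketch of this idea, the snippet below turns a list of change points into a table of periods, where each period ends at the next change point. The dates, the -1/0/1 coding, and the names chg_pts, data_end and periods are assumptions made up for illustration; they are not taken from my data files.

```r
# Hypothetical change points: start date and ENSO code of each period
# (-1 = La Nina, 0 = neutral, 1 = El Nino). Dates are invented for illustration.
chg_pts <- data.frame(
  start = as.Date(c("2000-01-01", "2001-03-01", "2002-06-01", "2003-04-01")),
  code  = c(-1, 0, 1, 0)
)

# Each period ends where the next one starts; the last period needs an
# explicit end date (here, an assumed end of the data).
data_end <- as.Date("2004-01-01")
periods <- data.frame(
  start = chg_pts$start,
  end   = c(chg_pts$start[-1], data_end),
  code  = chg_pts$code
)
print(periods)
```

Only the start dates and codes have to be recorded; the end dates fall out of the ordering of the change points.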

We can assign a color to each category rectangle so we can distinguish them.
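Here is a minimal sketch of drawing colored period rectangles behind a trend line with base R's rect(). It is not my downloadable script: the series is synthetic, and the period dates, the -1/0/1 codes and the variable names are assumptions chosen for illustration.

```r
# Synthetic monthly series standing in for the temperature anomaly data.
set.seed(1)
dates <- seq(as.Date("2000-01-01"), as.Date("2003-12-01"), by = "month")
anom  <- cumsum(rnorm(length(dates), sd = 0.05))

# Hypothetical event periods: -1 = La Nina, 0 = neutral, 1 = El Nino.
periods <- data.frame(
  start = as.Date(c("2000-01-01", "2001-03-01", "2002-06-01", "2003-04-01")),
  end   = as.Date(c("2001-03-01", "2002-06-01", "2003-04-01", "2003-12-01")),
  code  = c(-1, 0, 1, 0)
)

# Map codes -1, 0, 1 to positions 1, 2, 3 of the color vector.
event_cols <- c("lightblue", "white", "pink")
periods$col <- event_cols[periods$code + 2]

# Set up an empty plot first so the rectangles sit behind the line.
plot(dates, anom, type = "n", xlab = "", ylab = "Anomaly (C)")
usr <- par("usr")  # plot region limits: x1, x2, y1, y2
rect(periods$start, usr[3], periods$end, usr[4],
     col = periods$col, border = NA)
lines(dates, anom)
box()  # redraw the frame that the rectangles painted over
```

Drawing the rectangles before the line keeps the data visible on top of the shading, and taking the vertical limits from par("usr") makes each rectangle span the full height of the plot region.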

R Trend Chart with Change Points

Here’s my R version of the GISS Temperature Anomaly and ENSO Condition chart.

[Figure: R trend chart of GISS temperature anomaly with ENSO change point rectangles]

The R version covers the 1950 – 2008 time period. The pink periods are El Nino events, the light blue periods are La Nina events and the white background periods reflect neutral conditions.

The rectangles seem to work quite well. 

Data Files and R Script File

Readers can download the R script and two data files on my ProcessTrends site at this link. If you try the script, be sure to change the source data file path to the actual folder where you saved the data files.



6 Responses to Showing Change Points in a Trend Chart with R

  1. Pingback: Decadal Trend Rates in Global Temperature « Charts & Graphs

  2. Pingback: R Works With Factors « Charts & Graphs

  3. Pingback: R Works With Factors « Charts & Graphs

  4. DaveT said: Shouldn’t the Little Boy be blue and the Little Girl pink?

    El Nino events have a reputation for raising global temperatures and La Nina events have a reputation for cooling them. I selected the pink and blue based on their reported temperature effects, not baby gender. Shows how complicated color selection can be!

    DaveT had to e-mail me his script in a text file because WordPress and other software confuse R’s <- symbol for an html tag.

    Here’s DaveT’s script addition with the tags.

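    # Convert the date columns once, up front
    GISS_Data$my_dt <- as.Date(GISS_Data$my_dt, "%m/%d/%Y")
    my_Data$CP <- as.Date(my_Data$CP, "%m/%d/%Y")
    for (i in 1:periods) {
      # The start of this assignment was stripped as an HTML tag;
      # subset() and ">" are a reconstruction from what remained
      G <- subset(GISS_Data, GISS_Data$my_dt > my_Data[i,1] &
                  GISS_Data$my_dt < my_Data[i+1,1])
      # Draw this period's segment in the color for its event code
      points(G$my_dt, G$AnomC, type="l",
             col=cc[as.numeric(my_Data[i,2])+2])
    }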

    Thanks DaveT. Your script works quite well.

  5. well, that doesn’t look right.
    I guess pasting the code into your comment box didn’t work so well.
    I’ll try email.

  6. Kelly,
    Shouldn’t the Little Boy be blue and the Little Girl pink?

    Anyway.

    It appears from the chart that the temperature anomaly generally rises during El Nino and falls during La Nina. But it is hard to tell clearly since almost 60 years are so compressed. I’d widen the chart some.
    Another way to help show this which is easy in R is to color-code the curve according to the event.

    The following code at the end of your script ought to do this. (By the way, I needed to make par(tcl=NA) without quotes on line 5 of the script.)

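    cc <- c("blue", "grey40", "red")
    for (i in 1:periods) {
      # The start of this assignment was stripped as an HTML tag;
      # subset() and ">" are a reconstruction from what remained
      G <- subset(GISS_Data, as.Date(GISS_Data$my_dt, "%m/%d/%Y") > as.Date(my_Data[i,1], "%m/%d/%Y") & as.Date(GISS_Data$my_dt, "%m/%d/%Y") < as.Date(my_Data[i+1,1], "%m/%d/%Y"))
      points(as.Date(G$my_dt, "%m/%d/%Y"), G$AnomC, type="l", col=cc[as.numeric(my_Data[i,2])+2])
    }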
