Category Archives: Chart Principles

Basic principles for effective charts, drawing on guidance from Tufte, Cleveland, Robbins, and Few.

SAS – JMP 8’s Interactive Panel Chart Video

One of my ProcessTrends.Com readers, Andrew, sent me a link to a SAS – JMP 8 video on panel charts. I checked it out and was impressed.

Here’s the link. The 1 min 41 sec video gives you a good idea of how JMP can make panel charts interactively. If you are an Excel user, be sure to check it out to see what capabilities you are missing.

What do you think?  Any chance that Microsoft will see the light and make Excel a real analysis tool?



Step Charts: R is Easier Than Excel

In this post, I show how to make a Step Chart with R. The chart also includes a lowess smoother and annotation. Readers can visit my site to see how to make a step chart in Excel.

Using R to Enhance an Excel Chart

In this post I show how to add a loess fit from R to an Excel chart.

Dot Plots versus Stacked Bar Charts – Update 3

This post includes three updates. I have added to the original material to show the sequence of my thinking as the chart developed in response to reader input and charts.

Is A Poor Chart Worse Than No Chart?

Kaiser at Junk Charts posted on a New York Times chart from the Sept. 6, 2008 op-ed piece “Let’s Talk About Sex”. Kaiser and his commenters agreed that the chart was not effective.

The chart, partially reproduced to the right, shows 4 teenage sex indicators for 28 countries. The chart designer chose to use bubble size to compare the other country rates to the USA rate. The result is a confusing chart.

I have several concerns with this chart: 

1. Relationship to the article – I read the article to see how the chart fit into the writer’s discussion.  To my surprise, the writer did not mention the chart at all.  The chart stands on its own, with no relationship to the article.

2.  Chart Design – Using bubble size to compare rates is a poor charting technique. I created this dot plot of 1970 and 1998 teenage birthrates as an alternative to the Times chart.

3. Missing Data Analysis – Charts are an important tool in data analysis, but they are not the same as data analysis. We need to interpret, evaluate, and synthesize our data to gain understanding. The Times’ article and chart do not provide any interpretation, analysis or synthesis. Why have a chart if we are going to ignore it?

There are a number of important findings in the data that the author could have pointed out:

  • All countries except Ireland had a decrease in teenage birthrate from 1970 to 1998
  • 1998 teenage birthrates varied from a low of 4.6 births per 1,000 women in Japan to a high of 52.1 in the US
  • The US 1998 teenage birthrate was nearly 70% greater than the closest countries of New Zealand and Britain 
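Comparisons like these are simple arithmetic on the rates. As a minimal sketch, using only the two 1998 values quoted above (the function names are my own, for illustration), the US-to-Japan gap can be expressed either as a ratio or as a percent difference:

```python
def times_as_large(rate, baseline):
    """How many times larger `rate` is than `baseline`."""
    return rate / baseline


def percent_greater(rate, baseline):
    """Percent by which `rate` exceeds `baseline`."""
    return (rate - baseline) / baseline * 100


# 1998 teenage births per 1,000 women, as quoted in the list above
US_1998 = 52.1
JAPAN_1998 = 4.6

print(round(times_as_large(US_1998, JAPAN_1998), 1))   # 11.3
print(round(percent_greater(US_1998, JAPAN_1998), 1))  # 1032.6
```

The same two helpers would apply to any pair of countries in the source data.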

My blog post title asks the question: “Is a poor chart worse than no chart?” The New York Times’ article was not improved by the graphic. Since the graphic was so poor, it likely took focus from the author’s words for many readers without providing any insight into the issue being discussed. In this case, the poor chart was definitely worse than no chart.

Data Loss Aversion

I’m joining an ongoing discussion chain among four data and chart blogs that I follow. Andrew Gelman started the chain with a post on The Monkey Cage about a New York Times article on data visualization sites like Many Eyes. Andrew pointed out that the NYT example chart “.. is just horrible”. “It’s a classic example of a graph that looks cool but is just confusing.”

Kaiser Fung of JunkCharts followed up with a post on Loss Aversion raising concerns about “cramming as much data into the chart as possible”. Kaiser points out that this tendency is “..taking Tufte’s concept of maximizing data-ink ratio to the extreme.” In discussing the original NYT graph, Kaiser says “.. Every piece of data is given equal footing, which results in nothing standing out.”

Jorge Camoes, following up on Kaiser’s post, points to a Tufte corollary, “..To clarify, add detail”, which supports the loss aversion tendency. Jorge shows an example chart with nine time series and asks “does it make any sense to add those nine series to a single chart?”

Andreas Lipphardt of XLCubed followed up Jorge’s question on how to best show this chart data by adding an elegant set of grouped colors. 

Does Andreas’s color coding solve the readability issue? No! While it helps, it does not significantly clarify the data. We need to rethink our chart: what are we trying to show? There are three factors in this data set: year, income class and % of households in the class. What are we most interested in? Do we really need to show the data for each year? Aren’t we more interested in the long-term trend?

To me, the most important information is the long-term shift in income distribution by income group, not the year-to-year changes. Let’s use a dot plot and directly compare the 1967 and 2005 distributions.

The dot plot clarifies the situation by showing the income distribution for just 2 years so that we can compare changes by class. The % of households in the top 3 income classes was much higher in 2005, the $50-74,900 class stayed the same, and the % of households with total income less than $49,900 decreased.
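For readers who want to try this kind of two-year dot plot themselves, here is a rough sketch in Python with matplotlib. It is not the code behind my chart: the function names are hypothetical, and the rows are placeholder values, not the household income figures from the source data file.

```python
from typing import List, Tuple

# (category label, value in first year, value in second year)
Row = Tuple[str, float, float]

# Placeholder values for illustration only; NOT the household income data.
EXAMPLE_ROWS: List[Row] = [
    ("Class A", 10.0, 6.0),
    ("Class B", 20.0, 20.0),
    ("Class C", 5.0, 12.0),
]


def with_change(rows: List[Row]) -> List[Tuple[str, float, float, float]]:
    """Add the change (second - first) to each row, sorted by second-year value."""
    return sorted(
        [(label, first, second, second - first) for label, first, second in rows],
        key=lambda r: r[2],
    )


def draw_dot_plot(rows: List[Row], first_label: str, second_label: str,
                  path: str = "dot_plot.png") -> None:
    """Draw one row per category with a dot for each year, joined by a thin line."""
    # matplotlib is imported here so with_change() works without it installed
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    data = with_change(rows)
    y = list(range(len(data)))
    first = [r[1] for r in data]
    second = [r[2] for r in data]

    fig, ax = plt.subplots(figsize=(6, 3))
    ax.hlines(y, first, second, colors="0.7", linewidth=1)
    ax.scatter(first, y, facecolors="none", edgecolors="black", label=first_label)
    ax.scatter(second, y, color="black", label=second_label)
    ax.set_yticks(y)
    ax.set_yticklabels([r[0] for r in data])
    ax.set_xlabel("% of households")
    ax.legend(frameon=False)
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)


if __name__ == "__main__":
    draw_dot_plot(EXAMPLE_ROWS, "1967", "2005")
```

Sorting the rows by one year's value, rather than leaving them in spreadsheet order, is a design choice that makes the pattern of gains and losses easier to scan.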

In this case, changing chart type improved the chart more than enhanced color coding. We need to make sure we select the most appropriate chart before we try to optimize the chart format.

Kaiser’s Loss Aversion concerns raise an important charting principle: clarity in our chart purpose is critical to making an effective chart. More data or better colors won’t help a poor chart type selection.


Source data file link.