Every year, the worst movies of the year coming out of Hollywood are “honored” with an award called the Razzie. In an industry that normally pats itself on the back at every turn, the Razzies are a nice way to recognize that not every film churned out of the Hollywood machine is worthy of praise.
In similar fashion, I thought it would be fun to award some of the worst data visualizations coming out of our collective BI industry. Although it is always fun to poke fun at data visualizations that might be lacking in usefulness, it is also an opportunity for us to learn so that we do not make the same mistakes in our own work.
Not Using Dimension Limits
This was an amazingly inept example I came across. This pie chart actually breaks a few rules.
First, a pie chart is supposed to show the relationship of each slice to a whole. Sampling only the 100 most active tweeters and treating that as the whole does not give us much real value. A simple bar chart would have been a fairer representation.
Secondly, and more obviously, how does representing all 100 slices help us? We certainly cannot see all 100 names in the legend on the right, nor can we detect the differences between the slice sizes as we progress around the pie.
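One way to apply a dimension limit is to keep only the top few values and roll everything else into a single "Others" bucket. Here is a minimal Python sketch of that idea. The function name limit_dimension, the user handles, and the tweet counts are all hypothetical and are not taken from the chart above.

```python
def limit_dimension(counts, top_n=5, other_label="Others"):
    """Keep the top_n largest values and sum the rest into one bucket."""
    # Sort descending by value so the largest entries come first.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:top_n]
    # Everything past the cutoff is collapsed into a single total.
    rest_total = sum(value for _, value in ranked[top_n:])
    if rest_total > 0:
        top.append((other_label, rest_total))
    return top


# Hypothetical tweet counts per user (illustration only).
tweets = {"@ann": 120, "@bob": 95, "@cy": 80, "@dee": 12, "@eli": 9, "@fay": 4}
limited = limit_dimension(tweets, top_n=3)
print(limited)
```

The result is already sorted in descending order, so it can be fed straight into a bar chart, with the "Others" bucket at the end.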
3D Pie Charts
Speaking of pie charts, here is another one that definitely “chaps my hide”. This pie chart is already bad just because it is in 3D. Tilting the pie to give it a 3D appearance distorts the slices, making it harder to detect how large the slices are relative to each other and to the whole.
Add to that, the slices have a high degree of transparency to make it even more difficult to see where a slice begins and ends as the colors bleed together through the depth of the chart.
Not to pile on, but the developer also neglected to sort the slices in descending order by the metric, making this even harder to read.
Speaking of reading, forget about figuring out which label goes with which slice. It is not possible to follow the little lines to get to the correct label. Ouch…
I couldn’t resist adding this one. It only has 3 slices, which surprisingly add up to 193%.
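A quick sanity check before publishing a pie chart would catch this kind of mistake. The sketch below assumes the slice values are already percentages. The function name check_parts_of_whole and the three slice values are hypothetical, chosen only so that they total 193 like the chart above; the chart's actual values are not reproduced here.

```python
def check_parts_of_whole(slices, tolerance=0.5):
    """Return (is_valid, total) for slices that should sum to 100%."""
    total = sum(slices.values())
    # Allow a small tolerance for rounding in the source figures.
    return abs(total - 100.0) <= tolerance, total


# Hypothetical slice values that add up to 193 (illustration only).
suspect_pie = {"A": 70, "B": 63, "C": 60}
is_valid, total = check_parts_of_whole(suspect_pie)
print(is_valid, total)
```

If the check fails, the data is probably not a part-to-whole relationship (for example, survey respondents could pick more than one answer), and a bar chart is the safer choice.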
3D Bubble Chart
I can admire this developer’s ingenuity. “If only the scatter plot could handle one more expression”. Making the chart 3D indeed adds another axis the developer calls the S-axis. But trying to make sense of this takes a whole lot of effort from the reader.
Also, because the chart is presented in 3 (oh wait, 4) dimensions, it is difficult to decipher if a sphere is larger because it is close to us on the Z-axis or if it is just larger, corresponding to the S-axis.
Line Chart on a Non-Continuous Axis
The chart below is an example of using the wrong chart type for a data visualization. This data would have been better presented with a bar chart.
Line charts are meant to display an expression against a dimension list that is continuous, where the values relate to each other in a specific order. I generally think of a date sequence like months or years.
The card games presented here could be in any order. There is no reason to connect the points with a line.
The 3D Bar Chart
I have a special hatred for the 3D bar chart. Firstly, by nature of the chart being drawn in 3 dimensions, it is difficult to follow the height of each bar to the correct Y-value. This becomes more difficult as we get away from the axis.
But the more serious crime is that I can’t even see some of the values for the rear dimensions as they are hidden by the bars of the values in front of them. I guess those data points were not important.
The Tip of the Iceberg
Speaking of Fox News, here is another winner. This time we are looking at a bar chart representing people who have enrolled in United States government-sponsored healthcare. If you do not inspect the numbers (which were thankfully printed on the chart), the bar on the right appears to be about three times as tall as the bar on the left. But if you take the numbers into account, it becomes obvious that there is something wrong here. There should only be about a 15% difference from one bar to the other. Why is the visual so far off?
Well, we don’t really know, but my assumption is that the creator of this chart decided to start the Y-axis at a number other than zero. We cannot tell where the Y-axis actually starts because there are no numbers on the axis. This is a very misleading chart.
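To see how much a truncated axis can exaggerate a difference, here is a small Python sketch. The values 100 and 115 and the axis start of 92.5 are hypothetical, chosen only to reproduce a 15% real difference that looks 3-fold on screen; they are not the numbers from the chart, and apparent_ratio is a made-up helper name.

```python
def apparent_ratio(small, large, axis_start=0.0):
    """Ratio of drawn bar heights when the Y-axis starts at axis_start."""
    if axis_start >= small:
        raise ValueError("axis_start must be below the smaller value")
    # Each bar is drawn only from axis_start up to its value.
    return (large - axis_start) / (small - axis_start)


# Hypothetical values with a true 15% difference (illustration only).
honest = apparent_ratio(100, 115)                      # axis starts at zero
truncated = apparent_ratio(100, 115, axis_start=92.5)  # axis starts at 92.5
print(round(honest, 2), round(truncated, 2))
```

With the axis starting at zero the taller bar is 1.15 times the shorter one. Start the axis at 92.5 and the very same numbers are drawn as a 3-to-1 difference.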
The Dreaded Infographic
Hold on to your seats. This one is really bad. In this infographic of “How Baby Boomers Describe Themselves”, there are so many problems that I don’t know where to begin.
Let’s start off by saying that the percentages given do not add up to 100%. Does this mean that the data was incorrectly calculated? Or does it mean that each person was allowed to describe themselves with more than one trait? We will never know.
Also, no matter how you look at the color areas of the chart, the area or vertical space of the colors does not seem to correspond with the numbers presented on the right.
Having the colors fill up the shape of a person does not help the situation, because a novel shape will have varying widths from bottom to top. This makes it very hard to know what percentage each color represents.
Finally, what is the value for this particular data set of having the shape be a walking person?
As a general rule, maybe we shouldn’t use a strange shape to represent these values. It only serves to confuse and obscure the story the data is trying to tell.
The Worst of the Worst
So what is worse than using an obscure shape to represent your values? I would submit that using a shape that is, in itself, a whole other kind of chart would be worse.
At first blush, this data appears to be related to the 50 United States. But the data actually has nothing to do with geographical analysis. The reader is supposed to read the chart from left to right as the west coast corresponding to the year 1960 and time moving forward to the east coast which represents 2060.
The second problem with this chart is that none of the percentages seem to add up to 100%. For the left and right extremes we can maybe assume that the numbers for the upper regions are simply too small to be displayed. But how do we explain the middle section? There are only three colors and the three numbers add up to 92%.
This chart should be a stacked line chart. That way we could clearly see the important parts of the chart where the lines experience real movement. Ironically, what could be the most important part of the chart (the Great Lakes area in the northeast) is completely missing.
The mistakes in these data visualizations are obvious and extreme. But after you point your finger and chuckle, take a step back and look at your own visualizations. I know I have made some of these mistakes on a smaller scale.