Down came the rain and washed the spider out

I also wanted to call this newsletter "the lunatic in my head". The reason will soon be apparent to you.

May 28, 2020

Welcome back!

For those of you who are receiving this for the first time, here is a quick preamble.

My name is Karthik Shashidhar. I’m 37 years old and I live in Bangalore, India.

For 8-9 years now, I’ve been helping companies make sense of their data - in terms of using it to solve their own strategic business problems, enhance their product offerings, discover ways to monetise their data and launch new lines of business.

Along the way, I’ve written a book on market design, written columns for Mint and Hindustan Times, taught at IIM Bangalore and done some policy research for Takshashila. To know more about me, you can follow me on Twitter, or subscribe to my blog (over 2500 posts in over 15 years), or just check me out on LinkedIn.

Through the years, as I’ve helped businesses make sense of their data, or made sense of someone else’s data for my newspaper writing, I’ve been very particular about the way information is presented, and visualised. I’ve always believed that presenting data in the right manner can deliver insights in novel ways, and a lot of my consulting experience has borne this out.

I plan to send out this newsletter once a week (every Thursday noon IST, if I can maintain the discipline). I will analyse one piece of visualisation from the mainstream media each week.

Once again, thanks for reading this. If you like it, please subscribe, comment and share with whoever else you think might like it. This newsletter will be free for the foreseeable future.

Today we will talk about a spider chart. I’m sure most of you would have come across them, even if you didn’t know the name. They are used to compare two or more entities across four or more axes. The axes are arranged radially around a central point, with the value along each axis being marked off by a point. These points are then joined together to give a spider’s web sort of picture.
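For the more code-minded among you, here is a minimal sketch of that construction in Python. The function name, the choice to start at 12 o'clock and go clockwise, the 0-100 scale and the four scores are all my own assumptions for illustration; charting tools may lay things out differently.

```python
import math

def spider_points(values, max_value=100.0):
    """Return the (x, y) web points for one entity on a spider chart.

    Assumed layout: axis k of n points at an equal angular step from the
    previous one, starting at 12 o'clock and going clockwise. Each value
    is scaled so that max_value sits at distance 1 from the centre.
    """
    n = len(values)
    points = []
    for k, value in enumerate(values):
        angle = math.pi / 2 - 2 * math.pi * k / n  # direction of axis k
        radius = value / max_value                 # distance from the centre
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical scores on four axes, not taken from any real survey
pts = spider_points([100, 50, 100, 50])
```

Joining `pts` in order, and then back to the first point, gives the web for that entity. Repeating this for each entity on the same axes gives the overlapping webs you see in a typical spider chart.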

If that didn’t make sense to you, maybe the visualisation of the week will.

This set of charts comes from a website called “Visual Capitalist”, which describes itself as “one of the fastest growing online publishers globally, focused on topics including markets, technology, energy and the global economy.” Images from this website are frequently shared on Twitter and LinkedIn.

These graphs are from a piece they did back in April, on how media consumption habits had changed among different demographics in the early days of the covid-19 lockdown.

An organisation called Global Web Index surveyed 4,000 internet users aged 16 to 64 in the US and the UK, and asked them how their media consumption habits had changed after the covid-19 crisis hit.

The respondents were split by age band to correspond to the commonly accepted “generations” in the US (I don’t know if these generations apply to the UK as well, since the cultural markers there might be different). For each kind of media, they calculated the percentage of people in each generation who had increased their consumption of that media, and then drew up a spider chart for each generation.

A few things strike you if you manage to take one careful look at the graphics:

Boomers are watching more Broadcast TV
Gen X are watching more Broadcast TV and online videos
Millennials watch more of everything except “none of these” (wait, what is this none of the above doing on this chart?)
Gen Z watches more online videos and online TV (which is apparently different from online videos)

I didn’t manage to get any more information until I started looking at these graphics more carefully for this newsletter. This is the case with most spider charts for me - I end up ignoring most of the information.

So what is it about spider charts that makes them hard to understand, yet popular with graphic designers?

Spider charts look pretty. While conveying a large number of data points at once, spider charts also quickly highlight what data points make each category different (like my summary points above). And they compress a large amount of information in a fairly concise space (one set of bar graphs for each axis would make the display monotonous and voluminous).

In this particular case, though, the last point doesn’t hold. The vertical placement of the graphics on the website means that they’re not easy to compare. So the designers have gotten around the problem by including outlines of the preceding charts on each chart. Here is a better picture of the last chart in the series. I bet you didn’t see those outlines until I pointed them out to you.

And this is not the only problem with this particular set of spider charts.

For starters, there are too many categories. Spider charts are effective for four to five axes, and no more. Beyond that, the amount of information on the chart can be overwhelming to the reader.

Then, the representation of axis labels isn’t great - apart from the sheer number of axes, the labels being placed arbitrarily inside and outside the “heads” makes it challenging to even know what axes exist.

In fact, the heads in these graphics only serve to distract and obscure, and add no value whatsoever to the graphs. They make the “active area” of each chart much smaller than it needs to be, and make comparison across charts difficult. Then again, graphic designers face pressure to make graphics look “distinctive”, and that usually comes bundled with a loss of information content.

Then, spider charts have a fundamental maths problem - the information they convey is in the area of the polygon formed by all the “web points”, and this area is highly sensitive to the order in which the points are placed (“you rearrange me till I’m sane”, or till I make sense). Randomly reordering the axes can result in a spider chart that could convey totally different information, and this makes them less effective as a means of visualisation. With four or five axes, this is less of a problem, and that makes them a bit of a lesser evil in those cases.
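To see how much the order matters, here is a rough sketch in Python. It assumes equally spaced axes and non-negative values, and the six numbers are made up purely for illustration (they are not from the survey). The web is made of triangles between neighbouring axes, each with area half the product of the two neighbouring values times the sine of the angle between the axes, so the total depends on which values end up next to each other.

```python
import math

def spider_area(values):
    """Area of the polygon joining the web points, in axis order.

    Assumes n >= 3 equally spaced axes and non-negative values. Each pair
    of neighbouring axes contributes a triangle of area
    0.5 * r_i * r_next * sin(2*pi/n).
    """
    n = len(values)
    wedge = 0.5 * math.sin(2 * math.pi / n)
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

# The same six made-up values, placed around the chart in two different orders
grouped = [9, 8, 7, 1, 1, 1]      # large values next to each other
alternating = [9, 1, 8, 1, 7, 1]  # large and small values interleaved

ratio = spider_area(grouped) / spider_area(alternating)
```

With these numbers the grouped ordering encloses roughly three times the area of the alternating one, even though the underlying data is identical. The fewer the axes, the fewer distinct ways there are to arrange neighbours, which is one way to see why four or five axes are less risky.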

Finally, I’m puzzled at the treatment of “none of the above” in these graphs. I assume this represents the proportion of people in each category whose media consumption did not increase after the lockdown began. That itself, in my opinion, is an important statistic, and needs a graph by itself (maybe a simple bar graph) rather than being shoved into this massive spider chart.

Visualisation apart, there are many problems with the data analysis underlying these graphics. As mentioned earlier, it’s unclear if the generations defined for the US apply to the UK as well. Then, it’s not clear if the samples were randomly chosen. If they weren’t, the comparisons can’t be taken at face value.

While 4000 people were surveyed, it’s not clear how many belong to each bucket. Finally, there is too much information in the graphic. It is inevitable that readers will not bother about most of it. And sometimes, they may decide to not bother about any of it. And that remains a challenge with pretty-but-quirky visualisation.

That’s it for this edition. Next week, I’ll possibly go back to something from the mainstream media. If you have any suggestions on graphics to cover in next week’s edition, let me know.

Meanwhile, any kind of feedback or comment is welcome. Just reply to this email. And if you liked this, feel free to forward this to whoever else you think might like it.
