Monday, 10 March 2014

Data Visualization Mistakes

Data visualization is the practice of representing data visually so that it can be explored and analyzed. The data is abstracted into various visual forms, with attributes and variables serving as the units of information. Visualization is often considered one of the best ways to understand a data set. However, there are some common mistakes to avoid if the data is to be understood efficiently.

1. Error in Chart Percentages

Pie charts and tables are commonly used for visual representation. Including charts is not difficult, but it is easy to trip up and make mistakes. One of the most common mistakes with pie charts is dividing them into percentages that simply do not add up. The basic rule of a pie chart is that the percentages of all segments must sum to 100%. Consider this example:

Example of Error in Chart Percentages

In this example, the percentages not only fall short of 100%, but the segment sizes also do not match their values. This can happen for various reasons, such as a rounding error or a miscalculated percentage. It can also happen when categories that are not mutually exclusive are plotted on the same chart: one item can then be counted in more than one category, so the shares no longer represent parts of a single whole. Unless the categories are mutually exclusive, their percentages cannot be plotted as slices of the same chart. For such categories, it is better to use separate charts.
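The 100% rule can be checked mechanically before a chart is drawn. The following is a minimal sketch in Python, not a definitive implementation: the function names `percentages_sum_to_100` and `round_to_100`, the tolerance value, and the sample counts are all assumptions made for illustration. The second function uses the largest-remainder method, one common way to round shares so that they still total exactly 100; it assumes the counts are non-negative.

```python
import math


def percentages_sum_to_100(percentages, tolerance=0.01):
    """Return True if the slice percentages total 100 within a small tolerance."""
    return math.isclose(sum(percentages), 100.0, abs_tol=tolerance)


def round_to_100(counts):
    """Convert raw counts to whole-number percentages that sum to exactly 100.

    Largest-remainder method: round every share down, then hand the leftover
    points to the shares with the largest fractional parts.
    Assumes non-negative counts (an illustrative simplification).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive total")
    exact = [c * 100 / total for c in counts]
    rounded = [math.floor(x) for x in exact]
    leftover = 100 - sum(rounded)
    # Indices ordered by fractional part, largest first.
    order = sorted(
        range(len(counts)),
        key=lambda i: exact[i] - rounded[i],
        reverse=True,
    )
    for i in order[:leftover]:
        rounded[i] += 1
    return rounded


# Hypothetical counts: three equal categories.
counts = [1, 1, 1]
# Rounding each share on its own loses a point (33 + 33 + 33).
naive = [round(c * 100 / sum(counts)) for c in counts]
# Largest-remainder rounding keeps the total at 100.
fixed = round_to_100(counts)
```

Running a check like `percentages_sum_to_100` on the labels before plotting catches rounding and calculation errors early. It cannot detect overlapping categories, which have to be ruled out when the data is collected.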

2. Ambiguous Representation of Data

Ambiguity is not always intentional, and it creeps into data visualization quite often. Plotting accurate data is important, but it is equally important to avoid overly exotic graphs, which usually divert the reader’s attention from the actual data. Use color, brightness and saturation only where they are needed. Clear labels and other marks also help to clarify different aspects of the data.

Example of Ambiguous Representation of Data

The chart above would be clearer and easier to comprehend with lower saturation and brightness.
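Toning down an over-saturated palette can be done programmatically. The sketch below uses Python's standard `colorsys` module to scale the saturation and brightness of an RGB color in HSV space. The helper name `tone_down`, the scaling factors, and the sample color are assumptions chosen for illustration, not values taken from the chart above.

```python
import colorsys


def tone_down(rgb, saturation_scale=0.6, value_scale=0.9):
    """Return an (r, g, b) tuple of 0-255 ints with reduced saturation and brightness."""
    # colorsys works on floats in the range 0.0-1.0.
    r, g, b = (channel / 255 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Scale saturation and brightness (HSV "value"); the hue is left unchanged.
    s = min(1.0, s * saturation_scale)
    v = min(1.0, v * value_scale)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r, g, b))


# Hypothetical fully saturated red, softened with the default factors.
softer_red = tone_down((255, 0, 0))
```

Applying the same factors to every color in a palette keeps the hues distinguishable while making the chart less harsh to read.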

3. Displaying Too Much Data

People scanning a data visualization are usually looking for specific information. It is therefore very important to present only relevant, specific and concrete data and to leave out anything irrelevant. Irrelevant data, whether in tables or charts, makes the required information harder to find. The same applies to cluttered graphs: it is generally better to use several graphs to represent related quantities than to put them all into a single graph and clutter it.

A few simple, easy-to-read graphs are better than one complicated, cluttered representation of data. Similarly, the choice between a bar chart and a pie chart can affect the clarity of the representation.


The data in this chart is clearly congested, and it would have been better represented as a bar chart. A bar chart would also allow comparison between different units.

4. Consistency of Data Visualization

One of the most common mistakes in data visualization is mixing many different kinds of visualizations for the same data. Good practice is to pick one visualization technique and stick with it to the end. When different techniques are applied at the same time, the reader has to work out how to read each part before moving on to the next, and information can get lost along the way. To help the audience understand the information more efficiently, keep the visualizations consistent.

5. Keep It Simple

The most important lesson in data visualization, as in everything else, is not to let go of simplicity. It’s natural to feel that a more embellished or artistic representation would bring more clarity but, more often than not, it does not. It also distracts readers from the actual data. For instance, look at the example below.

Keep It Simple

Several things in the chart are ambiguous. It is not clear why the first image is blue and the rest are red. Further, the number in the second image is placed against the paintbrush and not against the head, while in all other columns it is placed against the head. A reader might simply admire the different figures, think about the real-life characters they represent, and move on without understanding the data.

The importance of visual representation of data has increased with the advent of mobile technology because of easy access to the internet. However, it is very important that visual representation of data is free of the pitfalls that make it ambiguous and irrelevant.