Many designers choose to export data from Excel into Adobe Illustrator’s charting tool. A few months ago, we found ourselves scratching our heads over a request that we received. The internal client had lots and lots of data on personal income and wanted to show this data as a scatter plot graphic, divided into quintiles. They had already created a basic version of this graphic in Excel, but needed the designers to redesign it to make it easier to understand and to add the visual polish and presentation for which design software is better suited.
Goals, approach and tools
We took a look and determined that our biggest task was simply learning how to create a scatter plot graphic that could handle the large volume of data in the original Excel file: 1,043 rows.
Our goal: Show intensity and data patterns across five categories (quintiles). Keep the data “live” in order to be able to quickly update the graphic with new data.
Our approach: Use Illustrator’s scatter plot tool to graph the data. Customize the graph to create a heat map in order to show intensity/concentration of data.
Our tools: Adobe Illustrator’s graph creation tool and Illustrator’s transparency settings to create a heat map.
An explanation of the final product: a heat map produced using Illustrator’s scatter plot graph tool
Take a look at the graph below. I recreated something similar to what my team designed (update: because we haven’t yet published the data, this graph shows widgets instead of the subject matter of the original graphic). This hypothetical graph shows costs of production (money) spread out across four categories (quartiles): a bottom quartile, a second quartile, a third quartile and the top (fourth) quartile. The darker the color (the heat map effect), the greater the intensity of those data. In other words, the darkest areas represent a large number of widgets with that cost of production, and the lightest areas represent a smaller number of widgets with that cost of production.
Needless to say, we learned a lot about Illustrator’s scatter plot capabilities in six hours.
What you need to know before you begin this tutorial
Before I begin, I’m assuming a basic level of familiarity with Illustrator (we were using CS5, but I believe all CS versions should work for this example) and its graph creation tool. If you don’t have that, search for Illustrator graphs and you’ll find plenty of tutorials. Better yet, FlowingData has a good basic tutorial on Illustrator graphs here. And so does Adobe. If you’re a designer, you probably already know the basics.
Although I’m also assuming that you know what a scatter plot graph and a heat map are, this tutorial will briefly explain how a scatter plot is used and compare it to a line graph.
To better explain all of this, let’s first start by building a more basic graph, a scatter plot graph.
As you can see from the example above, the graph contains four category rows (labeled “Category 1,” “Category 2,” etc.). In the real world, these categories could be years, quartiles or however you wish to divide up your data. Each category shows a row of data points associated with it (Category 1 shows a gap in the values between 6 and 15, for example). Keep an eye on Category 4’s outlier, the number 20 in the top right. More on that later.
How to correctly import data into Illustrator’s data tool
Half the battle is learning how to enter or import this data into Illustrator. Essentially, think of your data as a series of columns that alternate. The first column has your “Y” axis values (your categories); the second column has your “X” axis values (the data associated with each category).
In my example, for Category 1 to appear first on my “Y” axis, I entered “1” in the first column (repeating “1” as many times as I had data for that category). In the second column I entered all the data associated with Category 1. Repeat this pair of columns for each remaining category, and you’re all set.
Take a look at these first two columns in the data table to see how simple this is: [FIGURE 1].
Data table for a scatter plot graph: FIGURE 1
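If your data starts out grouped by category, the alternating-column layout described above can be generated rather than typed by hand. Here is a minimal Python sketch, not the method we actually used: the function name and the sample numbers are hypothetical, and I am assuming you then paste or import the resulting tab-delimited text into Illustrator’s data window.

```python
from itertools import zip_longest


def to_illustrator_columns(categories):
    """Lay out {category_number: [values]} as alternating column pairs.

    Each category contributes two columns: its number repeated once per
    value (the "Y" column), then the values themselves (the "X" column).
    Categories with fewer values are padded with empty cells.
    """
    columns = []
    for number, values in sorted(categories.items()):
        columns.append([str(number)] * len(values))  # "Y": repeated category number
        columns.append([str(v) for v in values])     # "X": the data for that category
    rows = zip_longest(*columns, fillvalue="")
    return "\n".join("\t".join(row) for row in rows)


# Hypothetical sample data, not the numbers shown in Figure 1.
sample = {1: [2, 4, 6], 2: [3, 5]}
table = to_illustrator_columns(sample)
print(table)
```

Each printed line is one row of the data table, with tabs separating the cells.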
Customizing your graph by using Illustrator’s “Graph Type” feature
As a final step, once you are finished working in the data table, click the checkmark button in the top right corner to output the graph. Then right-click on the graph itself and select “Type” from the menu. From the resulting “Graph options” dropdown in the dialog box, select the “Value axis.” In that dialog box, make sure that “Override Calculated Values” is checked. This is how to format your values for those fields:
- Minimum value: should always be set to zero
- Maximum value: should match the number of your categories (in the basic example shown in Figure 1, I had four categories, so I entered “4”)
- Divisions value: this is how the categories will be divided up. I always find this one intuitive, though difficult to explain. For this tutorial, make sure that the number you enter is one less than the total number of categories that you have (I had 4 categories, so I entered a “3”).
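The three settings above follow mechanically from the number of categories. As a small sketch (the function and key names are my own, not Illustrator terms, and the guard against zero categories is my assumption), the rule can be written in Python like this:

```python
def value_axis_settings(category_count):
    """Return the three override values from the list above."""
    if category_count < 1:
        # Assumption: the rule only makes sense with at least one category.
        raise ValueError("category_count must be at least 1")
    return {
        "minimum": 0,                     # always zero
        "maximum": category_count,        # matches the number of categories
        "divisions": category_count - 1,  # one less than the number of categories
    }


# The basic example in Figure 1 has four categories.
settings = value_axis_settings(4)
print(settings)
```

For the five quintiles in our original graphic, the same rule would give a maximum of 5 and a divisions value of 4.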
Basic scatter plot graphic in Illustrator: FIGURE 2
Here’s the resulting graphic that Illustrator will produce at this point (I added the color manually). The blue row in the graphic represents column 2 in the data table. Remember: the reason Illustrator knows to put those blue points under the row labeled “1” is because you labeled them as 1s in column 1 of the data table. You can later change the name of that row from “1” to “Category 1” (as an example) in the graphic itself. [FIGURE 2]
[Aside] How a scatter plot graph is different than a line graph: FIGURE 3
As an aside, if you’re not familiar with scatter plot graphs, here’s a quick explanation of how to interpret this one. Take a look at Category 4. It’s the green square in the top right of the graph shown in Figure 2, and it is the last column in the data table in Figure 1. Do you see how Category 4 has ten values, each of them 20?
But on the scatter plot graph, you only see the number 20 represented once (the green square). Scatter plot graphs won’t show you separate data points when they overlap exactly. A line graph will, however. Here’s the same data in a line graph. Category 4 (green) now shows you each data point with a value of 20. [FIGURE 3]
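If you want to know ahead of time how many points will land exactly on top of one another, you can count them before you graph anything. This is a rough Python sketch with made-up data that mirrors the ten 20s in Category 4; the function name and the other points are hypothetical.

```python
from collections import Counter


def overlap_report(points):
    """Summarize how many (category, value) points share exactly the same spot."""
    counts = Counter(points)
    return {
        "total_points": len(points),    # rows in the data table
        "visible_markers": len(counts), # distinct spots drawn on the scatter plot
        "stacked": {spot: n for spot, n in counts.items() if n > 1},
    }


# Hypothetical data: Category 4 holds ten values that are all 20.
points = [(4, 20)] * 10 + [(1, 3), (1, 6)]
report = overlap_report(points)
print(report)
```

The gap between the total number of points and the number of visible markers is exactly the information a plain scatter plot hides.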
Customizing the scatter plot graph: FIGURE 4
Back to the scatter plot graph: you can customize the labels in the graphic itself once you’re finished with the data view. For example, you can change the category numbers from 1, 2, 3 and 4 to specific category values that reflect how the data is actually organized (e.g., by quartile, by year, etc.). More importantly, you can customize further with fonts, colors, stroke widths, etc., some of which Illustrator will retain if you return to data view and change the data. Which ones, you might ask? That’s a post for another day. [FIGURE 4]
So, in the real world, what can Illustrator do for you? FIGURE 5
You can create a graphic that looks like this (this is not real data, of course): [FIGURE 5]
How to set up 1,043 rows of data: FIGURE 6
Here’s a look at the data. You’ll notice that the setup is identical to the basics that I outlined earlier. The figure below shows you a snapshot of how each row is set up. [FIGURE 6]
Turning a scatter plot into a heat map: Using transparency to further customize Illustrator’s scatter plot graph: FIGURE 7
I promised you a heat map, and here it is. Remember that a heat map essentially shows areas of concentration (or lack thereof) in data. In other words, it shows intensity.
To show intensity for those data points that overlap (like the repeated series of 20s that I pointed out in Figure 2), simply select *all* the points in a category with your direct selection tool. (If you’re familiar with the “Select Similar” feature in Illustrator, use your direct selection tool to choose just one data point, then use the “Select Similar” feature to automatically select all of the points in that row.) Then apply transparency to them all at once. Because transparency has a cumulative effect when layered on top of something else that is transparent, you are essentially creating a heat map effect. [FIGURE 7]
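To get a feel for how quickly stacked points darken, here is a small Python sketch of the arithmetic. It assumes every point has the same color and opacity and that they are layered with plain “normal” blending; the 20% opacity is a hypothetical value, not the setting we used.

```python
def stacked_opacity(alpha, layers):
    """Combined opacity of `layers` identical markers, each at opacity `alpha`.

    Assumes normal blending: each layer covers a fraction `alpha` of
    whatever still shows through, so the see-through part shrinks to
    (1 - alpha) ** layers.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if layers < 0:
        raise ValueError("layers must not be negative")
    return 1.0 - (1.0 - alpha) ** layers


# Hypothetical setting: every point at 20% opacity.
for n in (1, 2, 5, 10):
    print(n, round(stacked_opacity(0.2, n), 3))
```

Under these assumptions, one point reads as 20% color, two stacked points as 36%, five as roughly 67% and ten as roughly 89%. The lower the opacity you pick, the more overlapping points it takes before an area looks fully dark, which is worth keeping in mind with a data set as dense as 1,043 rows.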
A simple explanation of transparency: FIGURE 8
Well, that’s it. Please let me know if I’ve left anything out. Remember, this tutorial is meant to show how to customize Illustrator’s scatter plot graph tool. Just because you can, doesn’t mean you should! Once we publish the actual graphic, I’ll post that as well.