How Accurate Is Eyetracking?
By Colin Johnson, President & CEO, Eyetools Inc.
"How accurate are the findings from this Eyetrack III research?"
There are three ways to address this question depending on the degree of abstraction with which you're thinking about this study: at the level of the individual fixation, at the level of the individual page, and at the level of a test participant's overall experience.
1) The first way to answer this question focuses on the equipment and the most granular view of the data. Are the data and findings from the equipment accurate?
To this question the answer is yes, the findings are accurate. An eyetracker can capture and report where a person is looking to within about a centimeter on a typical computer screen. The manufacturers of eyetracking devices report this accuracy in units called "visual angle." The Tobii eyetracker that we used in this study is accurate to within 1 degree of visual angle. This means that if you draw an imaginary line from a person's pupil to the item she is viewing, the machine will report what it thinks she is looking at within 1 degree of that imaginary line.
To explain it in geometric terms, the line between the eye and the point of gaze is the long leg of a right triangle. The short leg of the right triangle is the distance between the true point of gaze and where the eyetracker thinks the person is looking. The visual angle is measured between the long leg and the hypotenuse that joins the ends of the two legs. When the long leg is about 50 centimeters (roughly 20 inches), a typical distance between a computer monitor and a person's eye, one degree of visual angle translates into a short leg that is about one centimeter long.
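The geometry described above can be checked with a few lines of code. The following is a minimal sketch, not part of any eyetracker's software: the function name `gaze_error_cm` and the sample viewing distances are assumptions chosen for illustration. It computes the short leg of the triangle as the viewing distance multiplied by the tangent of the visual angle.

```python
import math


def gaze_error_cm(viewing_distance_cm: float, visual_angle_deg: float = 1.0) -> float:
    """Return the short leg of the right triangle, in centimeters.

    viewing_distance_cm is the long leg (eye to true point of gaze);
    visual_angle_deg is the angle between the long leg and the hypotenuse.
    """
    return viewing_distance_cm * math.tan(math.radians(visual_angle_deg))


# Hypothetical viewing distances, in centimeters.
for distance in (50, 60, 70):
    print(f"{distance} cm from the screen -> {gaze_error_cm(distance):.2f} cm of error")

# 50 cm from the screen -> 0.87 cm of error
# 60 cm from the screen -> 1.05 cm of error
# 70 cm from the screen -> 1.22 cm of error
```

At typical viewing distances, then, one degree of visual angle stays close to one centimeter on the screen.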
When you consider that the human eye can perceive the immediate surroundings of its point of gaze through its peripheral vision, the 1 degree of error from the eyetracker is lost in the noise of how the human eye works anyway. To display individual viewing behavior, Eyetools has developed a patented visualization that incorporates a region around the fixation point that looks like a halo. This halo illustrates peripheral vision. As the distance from the point of gaze increases, the acuity gradient or "halo" becomes dimmer. This signifies how the acuity of our peripheral vision degrades the farther one moves from the point of gaze.
Think about it yourself. As you are staring at this word on the screen, you can see that there are other words on the screen. Without moving your eyes, you can probably make out a couple of words before and a couple of words after the word you're reading right now. You can probably also make out the words on the line above, or even the line above that one. The same holds for the lines below. Chances are, however, that you can't make out specific words that are a couple of paragraphs away. Now, without moving your eyes, try to become aware of the computer screen and the desk. How about the room at the periphery of your field of vision?
So you see, at the individual level, the data accurately reflects what people see on a screen. While there is a margin of error of plus or minus 1 degree of visual angle, this error falls within the margin of error of the natural function of the human eye. (For example, it is completely natural for people to focus just above or just below the line of text that they are actually reading.) The visualization of the data compensates for this as well: the halo drawn around each fixation point lets one intuitively appreciate the various items that might be the focus of a person's gaze.
2) The second way to answer this question is to think not about the accuracy of the individual fixations but rather to contemplate the accuracy of displaying the viewing behavior of groups of people upon a single page. Let's explore in greater detail how the aggregate images or "heatmaps" are generated and what they represent.
The aggregate image is another patented visualization from Eyetools. The basic aggregate image represents the viewing of a specified group of individuals on a given page. The warmer colors at specific points on the page signify that a larger percentage of the group viewed those points. One specifies whose data to include in a given aggregate image, and then the Eyetools Analysis Solution calculates all the relative percentages and displays them in their respective colors.
The key to the accuracy of the aggregate images is that they account for the acuity gradient in the same way that the individual images represent each test participant's peripheral vision. That means that the aggregate images represent where the participants of the group fixate as well as the areas of the page that could potentially have been perceived through the participants' peripheral vision. When an aggregate image includes data from eight people or more, the smoothing of these relative percentages of acuity gives a representative display of all the areas on the page where the individuals' viewing patterns overlapped. Through the algorithms used to generate this data visualization, the portions of the page that only a minority of the group viewed are de-emphasized. The warm colors indicate overarching consistency in viewing patterns across a group. Viewing behavior represented by cool colors is not as consistent.
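To make the idea concrete, here is a rough sketch of how an aggregate image of this general kind could be computed. It is not the patented Eyetools algorithm, whose details are not described here. The Gaussian falloff, the `sigma` value, the grid size, the function names, and the sample fixations are all assumptions made for illustration. Each participant's fixations are spread into a halo that dims with distance, and the per-participant maps are averaged so that cells viewed by most of the group score close to 1.0.

```python
import math


def acuity(dx: float, dy: float, sigma: float) -> float:
    """Assumed Gaussian falloff: 1.0 at the fixation point, dimmer with distance."""
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))


def participant_map(fixations, width, height, sigma):
    """For one participant, keep the strongest acuity any fixation gives each cell."""
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                grid[y][x] = max(grid[y][x], acuity(x - fx, y - fy, sigma))
    return grid


def aggregate_map(group, width, height, sigma=1.5):
    """Average the participant maps; high values mean most of the group viewed the cell."""
    total = [[0.0] * width for _ in range(height)]
    for fixations in group:
        pmap = participant_map(fixations, width, height, sigma)
        for y in range(height):
            for x in range(width):
                total[y][x] += pmap[y][x]
    return [[value / len(group) for value in row] for row in total]


# Hypothetical data: four participants, fixations given as (x, y) grid cells.
group = [
    [(2, 2)],
    [(2, 2), (7, 1)],  # only this participant looks at the right side
    [(3, 2)],
    [(2, 3)],
]
heat = aggregate_map(group, width=10, height=5)
print(f"shared region:   {heat[2][2]:.2f}")
print(f"minority region: {heat[1][7]:.2f}")
```

In this toy example, the cell that every participant fixated on or near scores far higher than the cell only one participant looked at, which mirrors how minority viewing is de-emphasized in the aggregate image.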
While these aggregate images do not tell the whole story of each participant's visual interaction with each page, they do give an accurate, pixel-by-pixel representation of which regions draw the attention of the group.
Now that we've explained the mechanics of the aggregate image, let's shift gears back to how these images are used in analyzing design performance. To do this, it is first important to highlight a phenomenon that few people realize exists. There are distinct differences between how a person views a page when seeking to evaluate it and how a person views a page when attempting to use it. Those people who are responsible for the design or content of a page give the page serious scrutiny and often virtually commit to memory every single pixel of that page. When they think about how their target user will experience the page, it is difficult for them to imagine the experience of a visitor who comes to that page only once or twice, and even then only briefly. It can be difficult for design professionals to recognize that the infrequent visitor sees only a limited amount of the information on the page and consequently is able to commit only a small fraction of that information to active memory.
If it is not in the data, it is not in the minds of your customers. The aggregate image is helpful for these professionals because it gives a single-glance depiction of what information they can assume visitors have in their minds after visiting a page. It is critical to emphasize that if the data says that people did not look at a given area of a page, then it's true. They really did NOT look at that area. The data represents what visitors focus on AND what they might perceive through their peripheral vision. If they didn't look, that area is not in their active memory. Because it is not in their active memory, they do not consider it when deciding what to do next or when determining whether they got the information that they require. The technical term for what people have in their minds and can act on is their "consideration set."
A case that illustrates this point was an independent trial that Eyetools researchers conducted in January 2001 on an early version of the E*TRADE website. The center section of the homepage at that time featured promotional offers designed to entice prospective customers to sign up. The researchers selected a group of six participants who fit the profile of potential customers of online brokerage services and asked each to visit the E*TRADE site and evaluate whether or not they might consider signing up for the service. The researchers tracked each participant's eye movements and then analyzed the data. Much to their surprise, they discovered that virtually none of the test participants more than glanced at the very promotions designed to draw them in. These promotions were almost completely ignored. The researchers wanted to determine if the data could predict what future users of the site might include in their consideration set.
To test this, they cached the homepage, converted the text to complete gibberish in all the areas of the page that they had determined visitors ignored, and then invited another six participants to repeat the same task of evaluating whether or not they might consider becoming customers. The only difference was that upon completion of the task, the researchers asked each participant if they thought there was anything weird or unusual about the site.
What they found was that not only was the viewing data essentially the same as the first round of testing, but that not one of the participants found anything odd about the site. "No, no, the site was great. No problem at all. Everything seemed fine."
The researchers then tested the site against a much larger sample of participants and found the same reaction. Only one person in 20 identified that there was anything strange about the site, suggesting that the visual presentation had failed to achieve the primary purpose of the homepage: to draw in new customers.
This example illustrates how important it is to understand what users include in their consideration set, and then to proactively modify the visual presentation of underperforming pages if the data suggests that a page's design is not performing as effectively as it could. The aggregate image represents this well and has been proven to be an accurate way to highlight potential problem spots on a page.
3) The third way to address this question is to consider whether or not the findings from eyetracking analysis are accurate representations of the overall experience of one's users. This is one of the most exciting areas of user research and one of the elements that we hope to explore further in Eyetrack III.
Simply put, the field of eyetracking research in website evaluation is still young. Researchers in this area are only just beginning to uncover the connections between eye movement and overall user experience. Until recently, capturing and analyzing data from users browsing websites was nearly impossible for most researchers because no tools existed to help them correlate the output from an eyetracker with scrolling webpages. The Stanford/Poynter Eyetrack 2000 study (Eyetrack II) was one of the first of its kind. When he was still a researcher at Stanford working on Eyetrack II, Eyetools founder Greg Edwards had to create a software application from scratch to meet this challenge because nothing existed in the marketplace for him to use. The code from that original software application became the foundation of the Eyetools Analysis Solution, which we used to do the data processing and analysis for Eyetrack III.
This is what we do know. How people use their eyes is an important indicator of what people think about things, what they like and dislike, where they succeed and how they fail when using websites. Psychologists have known for years that eye movement reflects both the unique drivers in an individual's conscious thought and the unconscious drivers that are consistent across a population. The challenge has been to capture the right data and then efficiently segment and analyze the data in a way that uncovers those drivers. Because of this, the quality of one's analysis tools determines how much insight one can expect to glean from analyzing this data.
The team at Eyetools believes that effectively interpreted eye movement can reveal more refined insights into one's customers than virtually any other indicator. The current Eyetools Analysis Solution represents a great advance in the analytical capabilities available to researchers and practitioners. To make its adoption easier, we have designed it to seamlessly integrate into conventional user testing as an augmentation to what companies and universities are doing today.
As the solution enables ever-growing levels of analysis automation, users of the solution will find it easier and easier to use eyetracking data as a way to understand overall customer experience.