Frequently Asked Questions

BACKGROUND
1. What exactly is "eyetracking"?
2. Don't eyetracking subjects have to wear funny headgear with cameras, and wouldn't that alter the reading experience such that it's not very realistic?
3. Can you track people who wear eyeglasses?
4. What does an eyetracking user session image look like?
5. What's a "heatmap"?
6. Can you track time spent on a page?
7. Is this the first Eyetrack study of news websites?
8. Can you help me understand how people read?

THE EYETRACK III STUDY
1. What equipment did you use?
2. What was the testing environment like?
3. Where did testing take place?
4. Who performed the actual testing?
5. How many participants were tested in this study? What were their ages, gender, etc.?
6. How were participants in the study recruited?
7. How long was the testing period per individual?
8. What did participants do during the session?
9. What were the goals of the study?
10. Were all the data collected "good"?
11. What's with the "comprehension" testing? That's not really eyetracking, right?
12. How accurate are the findings from this research? What's the "margin of error"?
13. What's the difference between a "finding" and an "observation" as found in the various Eyetrack III reports?
14. What is the importance of "counterbalancing" and how was this incorporated in the study?
15. What is the importance of controlling variables, and what does that mean for designing testing materials?
16. What variables did you test in Section 1 of the Eyetrack III testing?
17. How did you create and design the 10 mock news websites?
18. I'm a member of the press. Who can give me more information about Eyetrack III?


1. What exactly is "eyetracking"?

Eyetracking is research that tracks where a person's eyes look while reading, then analyzes the data to reveal patterns. By combining and reviewing data from multiple individuals during testing, you can discover representative patterns that apply to most of the population. For the Eyetrack III study we examined viewing patterns of prototype news websites. But eyetracking also can be used to study how people view printed newspapers and magazines (editorial content and/or advertising); to gauge the effectiveness of various forms of advertising, product packaging, and computer applications; in flight simulators; and even to track what people look at on shelves when grocery shopping.

Here's a more precise definition, from our friends at Eyetools, the company that conducted the Eyetrack III study: "Eyetracking is a monitoring technology that determines where a person is looking. Special cameras called 'eye trackers' can watch a person's eye and capture fixations and eye movements with a remarkable degree of accuracy (typically accurate to 1 cm on a standard computer screen) without requiring any special headgear."

In other words, it's like getting inside of a person's head and watching what they see -- with the advantage that a computer is recording every eye movement and fixation for later compilation and analysis.

2. Don't eyetracking subjects have to wear funny headgear with cameras, and wouldn't that alter the reading experience such that it's not very realistic?

It used to be that eyetracking research required people to wear headsets with small cameras that tracked the movement of their eyes and matched that to what they were viewing. The technology has now improved, so that in Eyetrack III our test subjects did not have to wear anything on their heads. They simply sat in a desk chair and looked at a standard-size (17-inch) computer monitor.

It wasn't a typical monitor, though. Current-generation eyetrackers put a small video camera below the screen, which is calibrated and locked on to the test subject's gaze. As long as the person's head doesn't move outside of the camera's field of view (a region of space about a cubic foot -- more than enough leeway for typical usage), the eyetracker stays on target throughout the session.

The technology has gotten so good that today there exists eyetracking equipment that can use a telephoto lens and track a stationary person's gaze from 20 feet away. (We didn't use such equipment for Eyetrack III.)

3. Can you track people who wear eyeglasses?

In Eyetrack III, we chose to recruit only people who did not need to wear glasses. Eyeglasses can make it more challenging (though not necessarily impossible) to eyetrack. Contact lenses were OK. As eyetracking technology improves, we'll have less trouble with glasses and will do more testing with people who wear them. (Remember, too, that fewer people now wear glasses, thanks to the popularity of laser eye surgery.)

Older camera-headgear systems often are able to get good data from people who wear glasses. For this study, we felt that the advantage of having a more natural online reading environment outweighed the limitation. And despite this limitation, we still were able to test some subjects over the age of 50.

4. What does an eyetracking user session image look like?

[Image: Single-user session]

Above is an image of a single test subject's journey through a single page (one of the homepages from the Eyetrack III test). It looks to most of us like a jumble of lines and dots, but eyetrack researchers can tell a lot from this.

Here's a graphic showing the features found on an individual session:

[Image: Individual-session explainer]

Notice lots of circles connected by thin lines. The circles are points of eye fixation -- where the person halted their gaze for at least a fraction of a second. The lines are "saccades" and indicate the path that the eyes took through the page connecting each fixation. Find the green dot; that's the entry point to the page. Follow the lines (if you can) and you'll see the eye path. The image also has numbers in black boxes; these numbers are timestamps and can help you follow the path more clearly.

You'll also note blue lines on some parts of the image. These represent the general trend of viewing over the page. The thickness of each line varies according to how much time was spent there.
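The elements of a session image can be thought of as a simple data structure. The following is a minimal sketch, not the format Eyetools actually used: the `Fixation` class, its field names, and the sample values are all assumptions made for illustration. It shows how saccades fall out of the fixation list (each saccade joins one fixation to the next) and how the entry point is simply the earliest fixation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fixation:
    """One pause of the gaze (a circle in the session image)."""
    x: float           # horizontal screen position, in pixels
    y: float           # vertical screen position, in pixels
    start_s: float     # when the fixation began, in seconds from page load
    duration_s: float  # how long the gaze rested there, in seconds


def saccades(fixations: List[Fixation]) -> List[Tuple[Fixation, Fixation]]:
    """Pair each fixation with the one that follows it; each pair is one saccade."""
    ordered = sorted(fixations, key=lambda f: f.start_s)
    return list(zip(ordered, ordered[1:]))


# Hypothetical values for illustration only; not data from the study.
session = [
    Fixation(x=120, y=80, start_s=0.00, duration_s=0.25),  # first look at the page
    Fixation(x=340, y=95, start_s=0.35, duration_s=0.30),
    Fixation(x=330, y=260, start_s=0.75, duration_s=0.20),
]

entry_point = min(session, key=lambda f: f.start_s)  # the "green dot"
path = saccades(session)
print(f"entry point: ({entry_point.x}, {entry_point.y})")
print(f"{len(session)} fixations, {len(path)} saccades")
```

A page with N fixations always has N - 1 saccades, which is why the path can be followed from the entry point using the timestamps alone.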

The individual images show you what one person did. To see the viewing trends of an entire group, we change the visual metaphor from lines and circles to that of a "heatmap" where the warmer colors represent where a higher percentage of the group looked.

5. What's a "heatmap"?

[Image: Heatmap]

The image above shows an example. A heatmap is an aggregate view of all the individual user session images (like the one in question 4 above) for a single webpage on a single task. Researchers combine all the individual page sessions to create a single view of a page, revealing eye patterns from the group of test subjects.

Here's a graphic explaining the various features of a heatmap:

[Image: Heatmap explainer]

The red/orange/yellow areas are where the larger percentages of the group looked (that is, where their eyes fixated for at least a fraction of a second). The dark blue areas are where the smaller percentages looked. There's a key at the top of the image to guide you. The image also has "X" marks throughout, indicating where participants clicked. (The numbers signify specific test subjects.) Also note the red horizontal lines on the page. These tell us how far down the page the test subjects went before leaving the page. In this example, the largest group scrolled all the way to the bottom of the page, but a fair number of people didn't scroll down below what was visible initially.

By looking at a heatmap image of a page, you might see that a particular ad, image, or headline hardly gets viewed, for example. You might find that the graphic that an artist spent many hours producing is hardly glanced at.

Heatmaps are especially interesting when you start analyzing them in relation to each other. Heatmap images of two differently designed homepages, for instance, can tell you which one does a better job of attracting attention to the content, the ads, or other key areas of the page. There's much you can learn simply by comparing these images from different pages.
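To make the aggregation idea concrete, here is a rough sketch of how individual sessions might be combined into a heatmap grid. It is not the Eyetools algorithm, which is not described here. The grid-cell approach, the `heatmap` function, and the sample coordinates are assumptions for illustration. Each cell holds the share of participants who fixated inside it at least once, which corresponds to the "percentage of the group" that the colors represent.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) fixation position, in pixels


def heatmap(sessions: Dict[str, List[Point]], width: int, height: int,
            cell: int) -> List[List[float]]:
    """Return a grid in which each value is the share of participants (0.0 to 1.0)
    who fixated at least once inside that cell."""
    if not sessions:
        raise ValueError("at least one session is required")
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    counts = [[0] * cols for _ in range(rows)]
    for points in sessions.values():
        # A set makes each participant count once per cell, however often they looked.
        visited = {(int(y // cell), int(x // cell)) for x, y in points
                   if 0 <= x < width and 0 <= y < height}
        for row, col in visited:
            counts[row][col] += 1
    total = len(sessions)
    return [[count / total for count in row] for row in counts]


# Hypothetical fixation positions for three participants on a 200 x 100 pixel page.
sessions = {
    "P1": [(10, 10), (15, 12), (150, 40)],
    "P2": [(20, 5)],
    "P3": [(160, 45), (30, 90)],
}
grid = heatmap(sessions, width=200, height=100, cell=100)
print(grid)
```

In this toy example the left half of the page would be drawn in the warmest color, because every participant looked there, while the right half would be cooler.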

6. Can you track time spent on a page?

Yes. (And we can track time spent on any component of the page.) Time spent on a page gets interesting when you're comparing different page designs. For example, in one part of Eyetrack III we had two nearly identical homepages, but one contained only headline links to inside stories while the other had headlines plus blurbs. The amount of time spent on each of those pages tells the researchers something. Eyetools researchers combine the heatmap images with time spent and other factors to determine the trends that you will read about on this Eyetrack III website.
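One plausible way to measure time spent on a page component is to add up the durations of the fixations that land inside it. The sketch below assumes exactly that. The rectangle representation, the `time_in_region` helper, and the sample numbers are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple

# (x, y, duration_s): fixation position in pixels and its duration in seconds
TimedFixation = Tuple[float, float, float]
# (left, top, right, bottom): bounds of a page component, in pixels
Region = Tuple[float, float, float, float]


def time_in_region(fixations: Iterable[TimedFixation], region: Region) -> float:
    """Sum the durations of all fixations that fall inside the region."""
    left, top, right, bottom = region
    return sum(duration for x, y, duration in fixations
               if left <= x < right and top <= y < bottom)


# Hypothetical fixations and a hypothetical headline area.
fixations = [(50, 40, 0.25), (60, 45, 0.5), (400, 300, 0.25)]
headline_box = (0, 0, 200, 100)

time_on_headline = time_in_region(fixations, headline_box)
time_on_page = sum(duration for _, _, duration in fixations)
print(f"headline: {time_on_headline} s of {time_on_page} s on the page")
```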

7. Is this the first Eyetrack study of news websites?

This is the third Eyetrack study of news conducted by The Poynter Institute and its partners. The first studied newspaper print editions; the second studied first-generation news websites. See "Eyetrack: A History of News Consumer Behavior."

8. Can you help me understand how people read?

In understanding eyetracking results and data, it's helpful to understand the process of human reading. People typically don't look at individual letters of each word, but rather recognize words as a whole, and they often look at more than one word at once.

According to psycholinguist Keith Rayner of the University of Massachusetts at Amherst, your eyes do not move smoothly across the text as you read. Instead, the typical reader behavior is to look at a word or several words in a group, then pause your eyes there briefly; this is called a "fixation," and it takes about 0.25 seconds on average. After a fixation, you move your eyes to the next word or group of words; this movement is called a "saccade" and takes only 0.1 seconds. (People often skip over short or predictable words such as "of," "in," "a," etc.) After this pattern is repeated once or twice, you pause to comprehend the phrase you just looked at (which on average takes 0.3 to 0.5 seconds).

According to Rayner, all these fixations and saccades result in 95 percent of college-educated people reading between 200 and 400 words per minute when reading an article; 300 words per minute is the average.
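The timing figures above can be combined into a back-of-the-envelope reading-speed estimate. This is only a sketch under stated assumptions: the fixation, saccade, and pause times come from the description above (using 0.4 seconds as the midpoint of the pause range), but the number of words taken in per fixation and the number of fixations per phrase are assumed values, not figures from Rayner.

```python
def words_per_minute(words_per_fixation: float,
                     fixations_per_phrase: int = 2,
                     fixation_s: float = 0.25,
                     saccade_s: float = 0.1,
                     pause_s: float = 0.4) -> float:
    """Estimate reading speed from the fixation / saccade / pause cycle."""
    # Time to read one phrase: several fixation-plus-saccade steps, then a pause.
    phrase_time = fixations_per_phrase * (fixation_s + saccade_s) + pause_s
    phrase_words = fixations_per_phrase * words_per_fixation
    return phrase_words / phrase_time * 60


# words_per_fixation=2.5 is an assumed value chosen for illustration.
estimate = words_per_minute(words_per_fixation=2.5)
print(f"about {estimate:.0f} words per minute")
```

With these assumed inputs the estimate falls inside the 200-to-400 range; a different words-per-fixation value would move it up or down proportionally.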

As you read about Eyetrack III findings, you'll often see references to "fixations." Simply remember that a fixation is a brief but measurable pause in looking at a word, phrase, or image. "Saccades" are the paths between these fixations.


THE EYETRACK III STUDY

1. What equipment did you use?

The eyetracker used was a Tobii Technologies model ET-17. To the user, the device looks like a standard 17-inch monitor and keyboard. The only difference is a small video camera positioned at the bottom of the monitor. The test subject is not required to wear headgear of any kind, and there are no other hardware components to cause distraction. The monitor and keyboard are black.

[Image: The Tobii ET-17 eyetracker]
[Image: Tobii stationary video camera]

According to the company, the device will tolerate fairly large and fast head movements. This is achieved without use of moving cameras. The device is calibrated for each test subject, but once set there's no need to recalibrate -- even in the event of a break to get a cup of coffee. Tracking is resumed instantly. (This is a major improvement over eyetracking procedures that require headgear or sensors.)

2. What was the testing environment like?

Testing took place in a small office with no distractions or clutter. Test subjects sat in a comfortable office chair; the Tobii eyetracking monitor was positioned on a spartan desk. After initial instruction by an Eyetools representative, the test subjects were left on their own to work through their assigned tasks -- which took about an hour, including post-testing questioning. Test administrators monitored each participant from another room through a Web-enabled moderator interface that tracked their progress. Participants could call for help if they had a question or ran into a problem.

3. Where did testing take place?

All testing for Eyetrack III took place in San Francisco, California, in the offices of Eyetools Inc.

4. Who performed the actual testing?

Eyetrack laboratory testing was performed by Eyetools Inc., a San Francisco, California-based software company that specializes in eyetracking analysis solutions for these types of user studies. The company was founded by Greg Edwards, who was the chief researcher at Stanford University during the Stanford-Poynter Eyetrack II study conducted in 1999-2000.

Eyetools personnel involved in this particular project included Edwards, CEO Colin Johnson, and research manager Leslie Kues.

5. How many participants were tested in this study? What were their ages, gender, etc.?

We tested 46 people in this phase of the study. (Actually, we ran 51 through the process, but were forced to toss out five people because of bad data; the shape of some people's eyes prohibits accurate tracking.) They ranged in age from 19 to 60, and represented a cross-section of the adult, Internet-using population. To participate in the research, test subjects must have reported that they were regular Internet users and had spent time recently reading news websites.

The majority of participants were between the ages of 19 and 45, with only six older than 45. Here's the breakdown:

Age group   Gender   No. tested
18-24       M         5
18-24       F         5
25-34       M         5
25-34       F         6
35-54       M        10
35-54       F        10
55 & up     M         2
55 & up     F         3

In terms of education level, the test group broke down this way: high school, 13%; associate degree, 11%; bachelor's degree, 45%; master's degree or Ph.D., 27%. So, our test pool was better educated than the general population.

The ethnicity breakdown was about 75 percent Anglo/Caucasian, with other groups making up the remainder. Part of the reason for such a preponderance of whites has to do with the limits of the eyetracking technology we used. The Tobii ET-17 eyetracker we employed was not ideally suited to track a variety of ethnicities; it's more difficult to track the eyes of some ethnic groups, especially some Asians, because of differences in the shape of the eye. The latest release of the no-headgear Tobii eyetracker, the model 1750, tracks all ethnicities equally well -- and we would hope to utilize it in future studies.

We are not publishing any analysis of Eyetrack III data based on ethnicity or educational level. The number of participants is not large enough to break the groups into even smaller categories and still yield statistically meaningful results.

6. How were participants in the study recruited?

Recruiters used market-research databases and placed ads on online community sites to find the participants for Eyetrack III. The recruiters then screened each person on the phone for suitable characteristics (had visited a news website recently; Internet user; not eyeglass wearer), and scheduled them for a specific test date. Each participant was paid a fee of $75 as compensation for his or her time.

7. How long was the testing period per individual?

One hour.

8. What did participants do during the session?

First, they received a brief overview of what to expect during the session. The moderators introduced them to the eyetracker and the testing interface. The interface automatically served them a pre-scripted combination of easy-to-follow tasks separated into four sections:

THE TEST WEBSITES
Here are links to the news websites* used in the study. Half the participants saw Homepages 1-5, and half saw Homepages 6-10.
  • Homepage 1
  • Homepage 2
  • Homepage 3
  • Homepage 4
  • Homepage 5
  • Homepage 6
  • Homepage 7
  • Homepage 8
  • Homepage 9
  • Homepage 10

  * The sites were optimized for viewing on the Windows version of Internet Explorer 6.0. They do not display correctly in some other browsers.

    SECTION 1: Catching up on the news
    •Participants were asked to catch up on the news by visiting five different news sites. These sites were mock news website prototypes, populated with unique real news stories, photographs, and multimedia content. (See box at right.)
    SECTION 2: Comparison of comprehension for multimedia vs. text
    •Participants first read a control text news article, then answered a series of multiple-choice questions that nominally gauged their comprehension but which were really intended to put the participants in a frame of mind that was better suited for answering the questions of the following two features. (Answers for these control questions were not tabulated.)
    •The group viewed an editorial feature that was presented either in text or in multimedia. They were told ahead of time that they would be asked questions afterward. Half of the group viewed this feature in its multimedia format, and the other half viewed the feature in its text format. Both groups were asked the same set of questions to gauge their comprehension of this first feature.
    •The group then viewed a second editorial feature that also was presented either in text or in multimedia. They were told ahead of time that they would be asked questions afterward. The group that viewed the first feature in text format viewed the second feature in multimedia format. Similarly, the group that viewed the first feature in multimedia format viewed this second feature in text format. Both groups were asked the same set of questions to gauge their comprehension of this second feature. Counterbalancing feature formats (text vs. multimedia) in this way allowed us to cancel out a bias effect that might have occurred if everyone had seen the same thing in the same order.
    SECTION 3: Multimedia editorial features
    •Participants received five minutes of discretionary time to select one or more multimedia features of their choosing from a list of eight options across a variety of topics.
    SECTION 4: Demographic questions
    •Participants concluded the session by answering a few personal demographic questions.

    9. What were the goals of the study?

    Each section of the study had distinct goals related to reading news online. Below are the goals for each section.

    SECTION 1:
    •Compare viewing across different news design elements (e.g., font size, use of blurbs, number of headlines)
    •Compare viewing across different news website styles and layouts that we modeled after current high-traffic news sites on the Web.
    •Compare viewing of different article page layouts and writing styles
    SECTION 2:
    •Compare comprehension of material in multimedia format vs. text format
    SECTION 3:
    •Gather preliminary information on how people view multimedia/interactive free-form articles

    10. Were all the data collected "good"?

    Not all, but most of them. The eyetracking equipment is fairly accurate and robust. It is designed to automatically track a person's eyes within a region of about a one-foot cube. That translates into a region large enough to track the eyes of a user who sits comfortably still in front of a 17-inch computer screen. If the person moves around more than normal or leans outside of the camera's field of view, the eyetracker stops collecting data. When a task suffered a lot of data loss in this way, we dropped the data for that task so that the gaps would not bias the results. In those cases, we occasionally would lose part of a session while other parts remained fine. If a participant could not be tracked at all, we dropped the session so that it would not bias the aggregate images.
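The rule described above (drop a task when too much gaze data is missing, and drop the whole session when nothing could be tracked) can be sketched as a small filter. The 50 percent cutoff, the `usable_tasks` helper, and the sample data are assumptions for illustration; how much loss counted as "a lot" in the study is not stated.

```python
from typing import Dict, List, Optional, Tuple

GazeSample = Optional[Tuple[float, float]]  # None = eyes not found for that sample

# Assumed cutoff: the study does not state how much loss counted as "a lot".
MAX_LOSS = 0.5


def usable_tasks(samples_by_task: Dict[str, List[GazeSample]],
                 max_loss: float = MAX_LOSS) -> List[str]:
    """Return the tasks whose share of missing gaze samples is within max_loss."""
    kept = []
    for task, samples in samples_by_task.items():
        if not samples:
            continue  # nothing recorded at all: treat the task as untrackable
        loss = sum(sample is None for sample in samples) / len(samples)
        if loss <= max_loss:
            kept.append(task)
    return kept


# Hypothetical session: one task lost most of its samples.
session = {
    "HP1": [(1, 2), (3, 4), None, (5, 6)],   # 25% loss: keep
    "HP2": [None, None, None, (1, 1)],       # 75% loss: drop
    "HP3": [(2, 2), (2, 3)],                 # no loss: keep
}
kept = usable_tasks(session)
print(kept)
```

An empty result corresponds to a participant who could not be tracked at all, in which case the whole session would be left out of the aggregate images.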

    (Eyetracking hardware is constantly evolving, with new equipment offering better performance. The no-headgear eyetracker we used was reasonably good about minimizing bad data, but subsequent revisions of that brand of eyetracker are considerably better.)

    Of the 51 people we invited to take part in the eyetracking testing, five of them proved to be not trackable, leaving us with 46 test subjects. The chart below shows the number of test subjects whose data were used on the 10 homepages that we tested as part of the study. Any conclusion that is based on data from a homepage was based on this sample size.

    Half our test subjects viewed one homepage in a set, the other half viewed the other matching homepage -- e.g., HP1 and HP6 are the same design with one variable. When comparing the controlled variable between the two homepages in a set, the researchers used data from the number of people listed here.

    Article pages were created that matched each of these five sets. Any observation or finding based on an element in an article page draws on data from, at most, the total number of people included in the set. As an example, large pictures on article pages were part of the HP5/HP10 set. Any data generated came from a total group size of 36 (20 + 16).

    Findings on recall in text and multimedia were based on questionnaire responses from 44 of the 46 individuals tested. Observations on multimedia news features were based on smaller samples that are reported with each observation.

    11. What's with the "comprehension" testing? That's not really eyetracking, right?

    A major part of the Eyetrack III study involved examining how news-website readers interacted with multimedia editorial content (interactive graphics, photo slideshows, audio, video, etc.) -- the newest form of journalistic storytelling. We wanted to learn whether multimedia editorial content was comprehended as well as, better than, or worse than text articles. So we devoted a small part of the hour-long testing period to text-vs.-multimedia comprehension using multiple-choice questions. We also kept the eyetracker going during these tests, which gave us additional eyetracking data to add to our other results.

    12. How accurate are the findings from this research?

    The answers to that question are lengthy, so we've created a separate page to address it, written by Eyetools CEO Colin Johnson. Succinctly, for those who want to move on more quickly:

    •The equipment used in Eyetrack III is very accurate. It can capture and report what people are looking at within a centimeter's distance when you measure a person's point of gaze on a 17-inch computer screen.
    •The aggregate images (heatmaps) of eye fixations and movements for the group of test subjects are a more accurate representation of viewing patterns in areas of the images shown by warmer colors (where more people looked), and less so for the cooler colors (where fewer people looked). The data in the images emphasize where people focus and what they might perceive through their peripheral vision.

    13. What's the difference between a "finding" and an "observation" as found in the various Eyetrack III reports?

    As you read through the reports on this website about Eyetrack III, you'll notice that some are labeled "findings" and others are "observations." The distinction between these two is that the "findings" are more rigorous and have been appropriately tested using tightly controlled variables. "Observations" are just that, observations made by the researchers that have not been as rigorously tested.

    For example, we report "findings" on homepage headline size, because we tested two matching homepages that were different in only one way: One used small headlines, the other used larger headlines. Half of our test group saw one homepage, the other half saw the other page. The single variable allowed us to report "findings" about headline size.

    For our report on news website advertising, we didn't use such tightly controlled variables. We tested five website designs, and each design contained different ad types, sizes, and placements. We could observe interesting behaviors of our test subjects as they looked at the various ads, but these are reported as "observations," because the testing protocol was not as stringent as in the headline-size example.

    14. What is the importance of "counterbalancing" and how was this incorporated in the study?

    If a person sees the same thing numerous times, he or she will begin to change viewing patterns after growing accustomed to it. We call this phenomenon "the order effect." When we study viewing behavior in eyetracking studies, we prefer to mix up the order in which we present the stimuli so that we can cancel out that effect. When we mix up or randomize the order of the tasks, we call that "counterbalancing." If we cancel out the bias from the order effect through counterbalancing and then discover that there are differences in the data, we can be more confident those differences are caused by the stimuli and not by the order. We used counterbalancing in Section 1 and Section 2 of Eyetrack III.
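As an illustration, the Section 2 design (half the group sees the first feature as text and the second as multimedia, the other half the reverse) could be assigned as sketched below. The alternating assignment, the `counterbalance` function, and the participant labels are assumptions; the actual assignment procedure used in the study is not described here.

```python
from typing import Dict, List, Tuple

Viewing = Tuple[str, str]  # (feature, format)


def counterbalance(participants: List[str]) -> Dict[str, List[Viewing]]:
    """Alternate participants between the two feature/format orders."""
    orders = [
        [("feature 1", "text"), ("feature 2", "multimedia")],
        [("feature 1", "multimedia"), ("feature 2", "text")],
    ]
    return {person: orders[index % 2]
            for index, person in enumerate(participants)}


# Hypothetical participant labels.
plan = counterbalance(["P1", "P2", "P3", "P4"])
for person, viewings in plan.items():
    print(person, viewings)
```

Every participant sees both features and both formats, and each feature is seen in each format by half the group, so a difference in comprehension between formats is less likely to be a side effect of which feature happened to be shown in which format.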

    15. What is the importance of controlling variables, and what does that mean for designing testing materials?

    By controlling variables, researchers can more effectively pinpoint the causes of differing behavior. In Eyetrack III, specific design changes on the news websites were controlled to pinpoint how viewing behavior differed across different design elements. For example, homepage No. 1 and homepage No. 6 contain the same content, but homepage No. 6 includes blurbs below the headlines. Because everything else on the pages is the same, we believe that any differences in the data between the two pages may be caused by the thing that is different -- the presence or absence of the blurbs. When there are multiple variables that are different across two or more versions that are being compared, it becomes more difficult to establish how each of the variables contributes to the difference. This challenge is particularly difficult when the number of test participants is small. By limiting the differences to just one variable, we can be more assured of the cause of the difference.
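A simple way to check that two test pages really differ in only one variable is to describe each page as a set of design attributes and compare them. The attribute names and values below are hypothetical, apart from the blurbs difference between homepage No. 1 and homepage No. 6 described above.

```python
from typing import Dict, List


def differing_variables(page_a: Dict[str, object],
                        page_b: Dict[str, object]) -> List[str]:
    """List the design attributes whose values differ between two pages."""
    keys = sorted(set(page_a) | set(page_b))
    return [key for key in keys if page_a.get(key) != page_b.get(key)]


# Hypothetical attribute sets; only the blurbs difference comes from the study.
homepage_1 = {"template": "A", "blurbs": False, "headline_size": "small"}
homepage_6 = {"template": "A", "blurbs": True, "headline_size": "small"}

changed = differing_variables(homepage_1, homepage_6)
print(changed)
```

If the list contains exactly one entry, a difference in viewing behavior between the two pages can be attributed to that variable with more assurance; if it contains several, the contributions of the individual variables are harder to separate.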

    16. What variables did you test in Section 1 of the Eyetrack III testing?

    Homepage variables:
    •Homepage templates (five designs commonly used by news websites)
    •Headline links only vs. headline-plus-blurb links
    •Headline quantity (more vs. fewer headlines)
    •Headline size (small vs. larger)
    •Text font size (small vs. larger)
    •Navigation placement (left, top, right)
    •Ad placement and types of ads (including text, animated, and rollover-expand ads)
    •Photo size (small vs. average vs. large)

    Article-page variables:
    •Design templates (five designs, related to five homepage templates)
    •Navigation placement (left, top, right)
    •Photos (none vs. average size vs. large size)
    •Text layout (1 column vs. 3 column)
    •Paragraph length (short vs. average vs. long)
    •Subheads vs. no subheads
    •Bulleted text vs. no bullets
    •Use of summary paragraph vs. none
    •Ad size and placement (including large "half-page," animated, and inset into article text)

    17. How did you create and design the 10 mock news websites?

    The Eyetrack III team first identified five news website designs most commonly used by the news industry. The team then worked with designers at Morris Digital Works -- who kindly volunteered their services -- to outline and develop the mock news websites to match the needs of the study. The mock websites were designed and produced by Morris designer Nik Wilets; they were hosted on the servers at Morris Digital Works. Content was drawn from a variety of Web news sources, and was chosen to be of interest when the actual testing took place. (That is, we chose mostly "evergreen" content that wouldn't appear dated, rather than deadline news.)

    18. I'm a member of the press. Who can give me more information about Eyetrack III?

    The following people can speak to you about this research:

    Steve Outing, project manager (senior editor, Poynter Institute)
      E-mail: steve@poynter.org
      Phone: 303-543-7810

    Laura Ruel, project manager (former executive director, Estlow Center for Journalism and New Media; assistant professor, University of North Carolina School of Journalism and Communication)
      E-mail: lruel@email.unc.edu
      Phone: 303-596-4691

    Colin Johnson, CEO, Eyetools Inc.
      E-mail: cjohnson@eyetools.com
      Phone: 415-235-0418