Recruit based on demographics or behavior?

Recruiting for a usability test is hard. (I’ve said this before.) And it’s the most important thing to get right in a test. So how do you decide who to recruit?

Demographics don’t describe behavior
If you buy the argument of your marketing department, you will look at the demographics of the various segments and try to match their proportions. You’ll know the ages, incomes, educations, ethnicity, and genders of your participants. But does knowing this help you predict behavior or performance? More importantly, with a sample of, say, eight participants, can you generalize discovered usability problems to the broader cohort?

Probably not. Here’s an example of why.

Though most video gamers are male, some are female. The problems and successes they have in using a game are similar. And there will be differences within the genders, too. Though most video gamers are young, there are a lot who aren’t. The problems they have in using a game are not likely due to differences in age if the participants have similar expertise on the platform and with the game (or similar games).

Behavior describes performance
Instead, the differences in behavior (interaction between the person and the technology) and performance (whether the human is successful in completing technology-mediated tasks) are much more likely to stem from differences in expertise.

Being younger or older doesn’t make you an expert at anything necessarily. Having a higher or lower household income doesn’t, either. You could argue that education level might, but it usually doesn’t unless there’s something in the test that is related to a particular domain that the educated person was specifically trained for.

You want people to be motivated to do the tasks you want them to do when they get into your test situation. This requirement might make it easier or more difficult to find people. For example, if you want to test an online banking service or find out if someone might sign up for a brokerage account online, it’s more likely that the participants will fall into a “mature” category on the age scale than at the younger end or the very old end. That is just because people in the mature range are more likely to have or want a mortgage than someone who is younger and isn’t in the market to buy a house, or someone who is older who really would rather have a reverse mortgage. You might find some on either end, too. Either way, you want to see a range of people with different aptitudes and skill levels.

How do you recruit, then?

Minimize the demographics for small tests, focus on knowledge and proficiency
Skip the demographic questionnaire (or minimize it at least) and focus on what participants have done related to what you’re testing.
If you are doing a test of a Web site, you might care about what kinds of things participants do on the Internet and how often they do them. Also, when was the last time? For example, what’s the last thing they bought online? Purchasing at an e-commerce site, no matter how well designed the site is, involves complex interaction. It might be a reasonable proxy for searching, narrowing a search, going through a decision process, filling in online forms, handling error and information messages, understanding where in an online process they are, and so on. But it doesn’t matter how old participants are, how educated they are, or (usually) what their household income is.

If you’re testing how well text messaging works, you want to know whether people do it already and how much. If they don’t do texting, you might want some people in your study who have received messages but don’t send them. By asking what their recent experiences were related to what you want to test (without giving away your tasks), you can find out about motivation as well as expertise.

And this brings us to a discussion about “novice” versus “expert.” But that’s another post.

Does Geography Matter?

Today I’ve been writing for the new edition of Handbook of Usability Testing about setting up a test environment. Should you be in the lab or in the field? If you’re in the lab, what should the setup be like and why? These seemed like fairly easy questions to answer. But then I got to a question that I’ve been wondering about myself for years: Does geography matter?

Nielsen says it doesn’t

Jakob Nielsen’s April 30, 2007 Alertbox (http://www.useit.com/alertbox/user-test-locations.html) says that geography doesn’t matter (unless there are international considerations or a single industry dominates the location or a couple of other things). “You get the same insights regardless of where you conduct user testing, so there’s no reason to test in multiple cities. When a city is dominated by your own industry, however, you should definitely test elsewhere.”

I sent my question around to several usability testing experts. Jared Spool sent one of the most interesting replies, and nearly everyone had experience indicating that geography does matter.

Spool, Killam, and James say it does matter

“Remember,” Jared Spool says, “if you know everything [emphasis mine] there is to know about your users, their tasks, and their contexts, then you never need to test in the first place — all you need to do is be really smart and create a simple design. At that point, it boils down to a simple matter of programming.”

Bill Killam, of User-Centered Design, put it this way:

Performance and subjective preference and motivation are all linked, so any change in location that affects one or more of these can be a factor across all of them. But we usually find it appears only in subjective data – not as much in behavioral observations. Even local variations like testing within the client’s office versus a “neutral lab” sometimes have noticeable effects on things like projected responding. However, also consider regional differences in the use or exposure to the product being tested. That will certainly affect results. Not to use too specific an example, but consider testing voting machines in the DC area versus a rural location. Or DC where paper and DREs [direct recording electronic voting machines] already exist versus NY where a full face ballot is used versus Oregon where all votes are by [mail].

Janice James contributed, “I’ve found that it IS important to test across multiple locations because I’ve found that the users do differ in terms of their experience level and exposure to product types, and technology, in general.”

Professor Spool and I continued the conversation by IM:

Dana: Okay, so it seems like your answer and Jakob’s article come from different assumptions. Jakob seems to assume that the field work is done. The team knows the context, etc. You seem to be saying that teams don’t always do the field work, first. By Nielsen’s parking meter example, the design team seems to have some background about the location.

Jared: Except teams always think they know everything.

Dana: I also think Jakob is assuming a fairly mature UX [user experience] group.

Jared: But, Jakob says, “Except for the few special cases discussed below, we’ve always identified the same usability findings, no matter where we tested. By now, we can clearly conclude that it’s a waste of money to do user testing in more than one city within a country.” Good thing he wasn’t testing soda. Or pop. Or coke.

Dana: Yes, to your example, testing IA [information architecture] is a REALLY good reason to test in multiple locations. And the design team always will get some benefit from being on site – usually something that wasn’t predictable.

Dana: And with the audience for this book, I think it’s safe to assume that they won’t have done much (or any) field work before doing usability testing.

Jared: Right.

Jared: Testing in more than one locale is definitely a luxury.

Jared: I wouldn’t not test at all because you can’t get to more than one venue. Another approach is to make it work great for the local community and look to support and other feedback channels to hear if regional differences pop up. It’s the cross-your-fingers approach to design. It’s worked well through the centuries. Another approach is to look at other competitive/comparable designs for things that might be regional. If the designs have elements that seem different, is there a regional explanation?

Jared: Many design issues are just pure human behavior, independent of any cultural or regional issues.

Dana: I believe that.

Jared: Rolf [Molich] and Carolyn [Snyder] did a study where they tested people in two countries on the same sites. They found 80% of the problems were in common. They found regional biases. People in Europe didn’t understand the purpose of a gift registry (and found it to be quite vulgar). But, if you perfected the design for your local venue, you’d nail 80% of the problems found anywhere else, if you extrapolate their results. And that’s a pretty good hit rate for a small budget.

Dana: I agree.

Jared: My guess is that’s what Jakob was trying to say.

Dana: That’s possible.

Jared: It’s hard to say with his shield of impenetrable ego obscuring the real intent.

Dana: Do you mind if I clean up this thread and use it in a blog post?

Jared: Not at all.

Jared: You can even leave in the impenetrable ego comment.

Dana: Makes it more believable that it was a conversation with Jared Spool.

Jared: Remember, all elephants are tall and flat, except for the instances when they are long and skinny.

Dana: That’s right. Anyway, thanks for answering the email and for continuing the discussion. I appreciate it.

Jared: I’m saying his exceptions are the generalized case. And his generalized declaration is rarely executable.

The Hardest Part: Getting the right participant in the room

This week has proved to me that nothing, absolutely nothing, matters as much as having the right participants.

Without the right participants, it all falls apart
If you don’t have participants who are appropriate, you can’t learn what you want to learn because they don’t behave and think the way real users do. You may get data, but what does it mean? Not much.

Who’s the right participant?
The right participant is a person. Not a set of demographics or psychographic data taken from market segmentations. It’s easy to lose sight of the idea that the person sitting in the chair using the product you’re testing is a person and not a tool for you to identify design problems – a substitute for you. He or she is a person with a personality, habits, memories, beliefs, attitudes, abilities, intelligence, experience, and relationships. You want the person to bring those things with them (along with their computer glasses). That’s the stuff of mental models. That’s what makes the sessions interesting and unpredictable.

How do you know?
You should be able to visualize who the participant-person will be by talking about the kinds of things you want them to do in the session. Here’s an example from a study I’m working on right now. We want:

Someone who travels at least a few times a year and stays a couple of nights in a hotel on each trip. This person books his own travel because it’s quicker and easier than giving instructions to someone else. He likes to book online because he can see options and amenities that inform his final decisions. This traveler knows where he’s going, how to get there, and what to do on arrival.

There’s a task with a context: booking travel accommodation online. There are motivations: it’s comparatively easy and there’s decision-making information available that isn’t otherwise. There is a level of experience in the task domain: traveling a few times a year and staying in hotels.

Visualizing participants this way is a technique I borrowed from User Interface Engineering.

You can create a screening questionnaire from that description that should get you appropriate participants. And look, there are very few selection criteria embedded in the visualization. We don’t care what the annual household income is, or the education level, or even what the person’s job is. Don’t make this too hard for yourself by collecting data you’re not going to use. (Besides, then you have to protect that personal information, but I’ll talk about that later.)
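If your recruiter tracks screener answers in a spreadsheet or script, the traveler visualization can be turned into a handful of behavior-based checks. Here is a minimal sketch in Python, not a finished screener. The field names, the `Candidate` class, and the thresholds (3 trips, 2 nights) are my assumptions about what “a few times a year” and “a couple of nights” might mean; adjust them to your own study. Notice that nothing in it asks about age, income, or education.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Answers to a hypothetical behavior-based screener (no demographics)."""
    trips_per_year: int          # trips taken in the past 12 months
    hotel_nights_per_trip: int   # typical nights in a hotel on each trip
    books_own_travel: bool       # books for themselves rather than delegating
    books_online: bool           # usually books on the web


def qualifies(c: Candidate) -> bool:
    """Return True when the candidate matches the traveler visualization.

    The thresholds are illustrative guesses, not fixed rules.
    """
    return (
        c.trips_per_year >= 3
        and c.hotel_nights_per_trip >= 2
        and c.books_own_travel
        and c.books_online
    )


# Made-up screener responses, for illustration only.
candidates = {
    "A": Candidate(5, 2, True, True),
    "B": Candidate(1, 3, True, True),    # travels too rarely
    "C": Candidate(6, 2, False, True),   # someone else books the trips
}
invited = [name for name, c in candidates.items() if qualifies(c)]
print(invited)
```

Every check maps to a sentence in the visualization, so anything you can’t trace back to it probably doesn’t belong in the questionnaire.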

Now, share your test objectives and your visualization of the participant with your recruiter.

Stay tuned for much more about recruiting, like how to work with a recruiter, where to find the right participants, and lessons that Sandy and I have learned through dozens of recruits.