Usability testing in the wild – ballots

I’ve been busy the last few weeks doing some of the most challenging usability testing I’ve ever done. There were three locations where I did day-long test sessions. But that wasn’t the challenging part. The adventure came in testing ballots for the November election.

What was wild about it?
This series of tests came together through a project with the Brennan Center for Justice and the Usability Professionals’ Association. The Brennan Center released a report in July called Better Ballots, which reviewed ballot designs and instructions, finding that

  • hundreds of thousands of voters have been disenfranchised by ballot design problems
  • there has been little or no federal or state guidance on ballot design that might have been helpful to elections officials who define and design ballots at the local level
  • usability testing is the best way to ensure that voters can use ballots to vote as they intend


Also in the report, the Brennan Center strongly urged election officials to conduct usability tests on ballots. The recommendation to include usability testing in the ballot design process is a major revelation in the election world. The UPA Voting and Usability Project has developed the LEO Usability Test Kit to help local elections officials do their own simple, quick usability tests of ballot designs.

But not all local elections officials were ready to do their own usability tests, and some wanted objective outsiders to help evaluate ballots for this particular, important upcoming election.

I did tests in three locations — Marin County, California, Los Angeles County, California, and the home of Las Vegas in Clark County, Nevada — with about 40 participants across the three locations. Several other UPA volunteers conducted tests and reviews in Florida, New Hampshire, and Ohio. In addition, UPAers trained local elections officials on usability testing and the LEO Test Kit in Ohio, Iowa, and a couple of other spots I can’t think of right now.

Pulling together a test in just a few days, including recruiting and scheduling participants
The Brennan Center report was released toward the end of July. Most ballots must be ready to print or roll out right now, in the middle of September. The Brennan Center sent the report to every election department in the US, and the response was great. Most requests came in during August, so the five or six of us available from the UPA Voting and Usability Project scrambled to cover the requests for tests.

One of the Brennan Center staff helped coordinate recruiting, although it took some pretty serious networking to get people into sessions on short notice, often within a few days.

The Brennan Center covered the expenses, but the time and effort spent by the people who worked with local elections officials and conducted the sessions was purely pro bono.

Not knowing what I would be testing until I walked onto the site
For two of the three tests, I didn’t see exactly what I was going to be testing until I walked in the door of the election department. (I got the other ballot two days before the test.) This happened for a couple of reasons. Sometimes the local election official didn’t have a lot of information about what could be evaluated and how that might happen. Sometimes the ballot wasn’t ready until the last minute because of final filing deadlines or other constraints. Sometimes it was all of the above.

Fortunately, the main task is pretty straightforward: Vote! Use the ballot as you normally would. But there are neat variations. Are write-ins possible? On an electronic voting machine, how do you change a vote? What if you’re mailing in a ballot – what’s different about that, and how do the design and instructions have to compensate for not having poll workers on hand to answer questions?

Giving immediate results and feedback
So, we’ve gotten copies of ballots, or something close to final on an electronic voting machine. We’ve met briefly with the local elections officials (and often with their advisory committees). We’ve recruited participants (sometimes off the street). We’ve conducted 8 or 10 or 15 twenty-minute sessions in one day. Now it’s time to roll up what we saw in the sessions and talk with the person who owns the ballot about how the evaluations went.

Handling enthusiastic observers and activists
A lot of people are concerned with the usability, accessibility, and security of ballots and voting systems. You probably are. Some are more concerned about it than others. Those are the people who show up to observe sessions. They’re well informed, they’re enthusiastic, and they’re skeptical. The observers and activists (many signed up to be test participants) were also keenly interested in understanding this activity. How was this different from focus groups or reviews by experts? How do we know that the problems we’ve witnessed are generalizable to other voters in the jurisdiction?

The good news: Mostly, the ballots worked pretty well. The local elections officials usually can still make small changes at this stage, and they were willing to, especially to improve the instructions to voters. By doing this testing, we were able to effect change and to make voting easier for many, many voters. (LA County alone has more than 3 million registered voters.)

Links:
Brennan Center for Justice report Better Ballots
http://www.brennancenter.org/content/resource/better_ballots/

UPA’s Voting and Usability Project
http://www.usabilityprofessionals.org/civiclife/voting/
voting@usabilityprofessionals.org

LEO Usability Testing Kit
http://www.usabilityprofessionals.org/civiclife/voting/leo_testing.html

Ethics guidelines for usability and design professionals working in elections
http://www.usabilityprofessionals.org/civiclife/voting/ethics.html

Information about being a poll worker
http://www.eac.gov/voter/poll%20workers

EAC Effective Polling Place Designs
http://www.eac.gov/election/effective-polling-place-designs

EAC Election Management Guidelines
http://www.eac.gov/election/quick-start-management-guides

Retrospective review and memory

One of my favorite radio programs (though I listen to it as a podcast) is Radiolab, “a show about science,” which is a production of WNYC hosted by Robert Krulwich and Jad Abumrad and distributed by NPR. This show contemplates lots of interesting things, from reason versus logic in decision making to laughter to lies and deception.

The show I listened to last night was about how memories are formed. Over time, several analogies have developed for human memory that seem to be related to the technology available at that time. Robert said he thinks of his memory as a filing cabinet. But Jad, who is somewhat younger than Robert, described his mind as a computer hard disk. Neurologists and cognitive scientists they talked to, though, said No, memory isn’t like that at all. In fact, we don’t store memories. We recreate them every time we think of them.

Huh, I thought. Knowing this has implications for user research. For example, there are several points at which usability testing relies on memory: the memory of the participant if we’re asking questions about past behavior; the memory of the facilitator for taking notes, analyzing data, and drawing inferences; and the memories of observers in discussions about what happened in sessions and what it means.

Using a think-aloud technique – getting participants to say what they’re thinking while working through a task – avoids some of this. You have a verbal protocol as “evidence.” If there’s disagreement about what happened among the team members, you can go back to the recording to review what the participant said as well as what they did.

But there are times when think-aloud is not the right technique, either because the participant cannot manage the divided attention of doing a task and talking about it at the same time, or because of other circumstances. In those situations, you might think about doing retrospective review, instead.

“Retrospective review” is just a fancy name for asking people to tell you what happened. If you have the tools and time available, you can go to a recording after a session, so the participant can see what she did and respond to that by giving you a play-by-play commentary.

As soon as participants start viewing or listening to the beginning of an episode – up to 48 hours after doing the task – they’ll remember having done it. They probably won’t be able to tell you how it ended. But they will be able to tell you what’s going to happen next.

And that’s the really useful thing about doing retrospective review. As the participant recreates the memory of the task, you can ask, “What happens next? What will you do next and why?” Pause. Listen. Take notes. And then start playing back the recording again. Sure enough, it’ll be like the participant said. Only now you know why.

Asking participants what happens next in their own stories also avoids most revisionist history. That is, if you ask participants to explain what happened after they view it, they may rationalize what they did. This isn’t the same as remembering it.

Getting ready for sessions: Don’t forget…

There are a bunch of things to do to get ready for any test besides designing the test and recruiting participants.

  • make sure you know the design well enough to know what should happen as the participant uses it
  • copy any materials you need for taking notes
  • copy all the forms and questionnaires for participants, including honorarium receipts
  • organize the forms in some way that makes sense for you. (I like a stand-up accordion file folder, in which I sort a set of forms for each participant into each slot. I stand up the unused sets and then when they’ve been filled out, they go back in on their sides.)
  • check in with Accounting or whoever on money for honoraria or goodies for give-aways
  • get a status report from the recruiter
  • double-check the participant mix
  • make sure you have contact information for each participant
  • check that you have all the equipment, software, or whatever that you need for the participant to be able to do tasks
  • run through the test a couple of times yourself
  • double-check the equipment you’re going to use (I use a digital audio recorder, so I need memory sticks for that, along with rechargeable batteries)
  • charge all the batteries
  • double-check the location

Which gets us to where you’re going to do the sessions. But let’s talk about that later.

Where usability testing fits into your research strategy

What, you don’t have a research strategy? Let’s think about the future here.

It’s not uncommon – and not bad – to be working in the present, reacting to the ever-growing demand for usability testing in your organization. “Ever-growing” is good. But when Jared Spool asked me to do a podcast with him recently to talk about what I think makes the difference between a good user experience team and a great user experience team, it got me thinking.

The recipe, based on my observations in dozens of corporations, comes down to these three main ingredients:

  • Vision
  • Strategy
  • Involvement

Vision is an overused word, but here I mean that you and your team have visualized the ideal customer experience — no limits, no constraints. Imagine the best possible interactions a customer could have with your organization at every touch point. Write it down.

Strategy means that you have a plan for reaching the vision. Over the long term, you can learn about and take into account customers’ contexts and goals while matching those up to the goals and objectives of the business.

Involvement means calling all interested people in the business together (and that really should be everyone, from management to design to development to support and anyone else in the organization) to embrace the vision and carry out the strategy across disciplines.

But I haven’t said much about usability testing yet. Where does it fit in? Everywhere. Part of my strategy would be to teach as many people in the organization as possible to do usability testing. You probably can’t do all the testing that is wanted (let alone needed). If you teach others to do it and coach them along the way, the customer ultimately benefits: the organization gains a closer, smarter understanding of the customer experience and can make evidence-based decisions about how to reach the ideal experience it has envisioned.

Usability testing and democracy: evaluating ballot designs makes the headlines

Today the Brennan Center for Justice at the NYU School of Law released a major report about the impact of poor ballot designs and unclear instructions on voters, and about the importance of usability testing.

Among the highlights is an overview of the Usability Professionals’ Association (UPA) usability testing kit for local election officials (the LEO Usability Testing Kit). Members of the UPA Usability in Civic Life Project are working with the Brennan Center to provide direct training for election officials.

The report is titled Better Ballots and can be found on the Brennan Center site:
http://www.brennancenter.org/
http://www.brennancenter.org/content/resource/better_ballots/

The report was released today, and three articles in USA Today and the New York Times highlight it.
Study: Poor ballot designs still affect U.S. elections
http://www.usatoday.com/news/politics/election2008/2008-07-20-ballots_N.htm

Ballot designs are ‘literacy test for voters’
http://www.usatoday.com/news/politics/election2008/2008-07-20-ballot-inside_N.htm

Influx of Voters Expected to Test New Technology
http://www.nytimes.com/2008/07/21/us/21voting.html

Stop writing reports

I’ve had a few questions from readers lately about standardizing reports of usability test results. Why is there no report template in the Handbook? There’s no “template” for a final report because I think you probably shouldn’t be writing reports. Or at least written reports should be minimal. Mini.mal. Though the outline should basically be what’s in the Handbook, what you put in your report depends on

  • The test design and plan
  • What your team needs and can use

And let’s use “report” in the loosest possible way: delivering information to others. That’s it. Your report doesn’t have to be a long, prose-based, descriptive tome. (Not that there’s anything wrong with that.) And the delivery method doesn’t have to be paper.



That leaves a lot of options, from an email with a bulleted list of items, to a “top line” post on a blog or wiki that lightly covers the main trends and patterns. In the middle of the range might be a classic usability test report that describes results and findings in some detail. (I personally dislike slide decks as reports, but a lot of organizations do them.) These will all work for any type of test. For summative tests, you may want to go as far as the CIF, or Common Industry Format, established by the International Organization for Standardization (ISO). But if your team has observed the sessions and attended the debriefs, you probably don’t need much of a report. They won’t read it; everything has been discussed and decided already. Whatever you deliver is simply a record of that set of decisions and agreements.

Making it easy to collect the data you want to collect

As I have said before, taking notes is rife with danger. It’s so tempting to just write down everything that happens. But you probably can’t deal with all that data. First, it’s just too much. Second, it’s not organized.

Let’s look at an example research question: Do people make more errors on one version of the system than the other?

And we chose these measures to find out the answer:

  • Count of all incorrect selections (errors)
  • Count and location of incorrect menu choices
  • Count and location of incorrect buttons selected
  • Count of errors of omission
  • Count and location of visits to online help
  • Number and percentage of tasks completed incorrectly
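
If you type the tallies up afterward, a small script can keep the counts tied to these measures rather than to everything that happened in the sessions. Here’s a minimal sketch in Python, assuming one record per participant per task and the two-version (A/B) comparison from the research question above; the field names and example data are mine, for illustration only.

    # Minimal sketch: one record per participant per task, holding only the
    # measures listed above. Field names and example data are illustrative.
    from dataclasses import dataclass, field

    @dataclass
    class TaskObservation:
        participant_id: str
        version: str                  # "A" or "B"
        task: str
        wrong_selections: int = 0     # count of all incorrect selections (errors)
        wrong_menu_choices: list = field(default_factory=list)  # where each wrong menu choice happened
        wrong_buttons: list = field(default_factory=list)       # where each wrong button was selected
        omissions: int = 0            # errors of omission
        help_visits: list = field(default_factory=list)         # where the participant opened online help
        completed_correctly: bool = True

    def summarize(observations):
        """Roll the per-task records up into the measures listed above."""
        total = len(observations)
        incorrect = sum(1 for o in observations if not o.completed_correctly)
        return {
            "errors": sum(o.wrong_selections for o in observations),
            "menu_errors": sum(len(o.wrong_menu_choices) for o in observations),
            "button_errors": sum(len(o.wrong_buttons) for o in observations),
            "omissions": sum(o.omissions for o in observations),
            "help_visits": sum(len(o.help_visits) for o in observations),
            "tasks_incorrect": incorrect,
            "percent_incorrect": round(100 * incorrect / total, 1) if total else 0.0,
        }

    # Example: one participant, one task, on version B
    observations = [TaskObservation("P01", "B", "find polling place",
                                    wrong_selections=2,
                                    wrong_menu_choices=["Settings > Profile"],
                                    omissions=1,
                                    completed_correctly=False)]
    print(summarize(observations))

Filtering the records by version before calling summarize gives the two sets of numbers the research question asks you to compare.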

Continue reading “Making it easy to collect the data you want to collect”

Translating research questions to data

There’s an art to asking a question and then coming up with a way to answer it. I find myself asking, “What do you want to find out?” The next question is, “How do we know what the answer is?”

Maybe the easiest thing is to take you through an example.

Forming the right question

On a study I’m working on now, we have about 10 research questions, but the heart of the research is about this one:

Do people make more errors on one version of the system than the other?

Note that this is not a hypothesis, which would be worded more like, “We expect people to make more mistakes and to be less likely to complete tasks on the B version of the system than on the A version.” (Some would argue that there are multiple hypotheses embedded in that statement.)

But in our study, we’re not out to prove or disprove anything. Rather, we just want to compare two versions to see what works well about each one and what doesn’t.

 

Choosing data to answer the question

There are dozens of possible measures you can look at in a usability test. Here are just a few examples:

Continue reading “Translating research questions to data”

Data collecting: Tips and tricks for taking notes

A common mistake people make when they’re new to conducting usability tests is taking verbatim notes.

Note taking for summative tests can be pretty straightforward. For those you should have benchmark data that you’re comparing against or at least clear success criteria. In that case, data collecting could (and probably should) be done mostly by the recording software (such as Morae). But for formative or exploratory tests, note taking can be more complex.

Why is it so tempting to write down everything?

Interesting things keep happening! Just last week I was the note taker for a summative test in which I noticed (after about 30 sessions) that women and men seemed to hold the stylus differently as they marked what we were testing, and that the difference seemed to be causing a specific category of errors.

But the test wasn’t about using the hardware. This issue wasn’t something we had listed in our test plan as a measure. It was interesting, but not something we could investigate for this test. We will include it as an incidental observation in the report as something to research later.

Note taking don’ts

  • Don’t take notes yourself while you’re moderating the session, if you can help it.
  • Don’t take verbatim notes. Ever. If you want that, record the sessions and get transcripts. (Or do what Steve Krug does, and listen to the recordings and re-dictate them into a speech recognition application.)
  • Don’t take notes on anything that doesn’t line up with your research questions.
  • Don’t take notes on anything that you aren’t going to report on (either because you don’t have time or it isn’t in the scope of the test).

 

Tips and tricks

  • DO get observers to take notes. This is, in part, what observers are for. Give them specific things to look for. Some usability specialists like to get observer notes on large sticky notes, which is handy for the debriefing sessions.
  • DO create pick lists, use screen shots, or draw trails. For example, for one study, I was trying to track a path through a web site to see if the IA worked. I printed out the first 3 levels of IA in nested lists in 2 columns so it fit on one page of a legal-sized sheet of paper. Then I used colored highlighters to draw arrows from one topic label to the next as the participant moved through the site, numbering as I went. It was reasonably easy to transfer this data to Excel spreadsheets later to do further analysis (there’s a small sketch of how that path data might be analyzed after this list).
  • DO get participants to take notes for you. If the session is very formative, get the participants to mark up wireframes, screen flows, or other paper widgets to show where they had issues. For example, you might want to find out if a flow of screens matches the process a user typically follows. Start the session asking the participant to draw a boxes-and-arrows diagram of their process. At the end of the session, ask the participant to revise the diagram to a) get any refinements they may have forgotten, b) see gaps between their process and how the application works, or c) some variation or combination of a and b.
  • DO think backward from the report. If you have written a test plan, you should be able to use that as a basis for the final report. What are you going to report on? (Hint: the answers to your research questions, using the measures you said you were going to collect.)
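
A note on the path-trail example above: once the numbered arrows are typed up, even a tiny script can answer questions like “what did people click first?” or “how many participants reached the target page?” Here’s a minimal sketch in Python; the page labels, participant IDs, and target page are invented for illustration, not taken from the study described above.

    # Minimal sketch: each hand-drawn trail becomes an ordered list of IA labels.
    # Labels, participant IDs, and the target page are made up for illustration.
    from collections import Counter

    paths = {
        "P01": ["Home", "Services", "Licenses", "Renew a license"],
        "P02": ["Home", "About us", "Services", "Licenses", "Renew a license"],
        "P03": ["Home", "Services", "Forms"],  # never reached the target page
    }
    target = "Renew a license"

    first_clicks = Counter(path[1] for path in paths.values() if len(path) > 1)
    reached = sum(1 for path in paths.values() if target in path)
    clicks_taken = {pid: len(path) - 1 for pid, path in paths.items()}

    print("First clicks from Home:", dict(first_clicks))
    print(f"Reached the target: {reached} of {len(paths)} participants")
    print("Clicks taken per participant:", clicks_taken)

The same counts are easy to produce in Excel, of course; the point is just that the trail data stays structured enough to analyze however you like later.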