Unmoderated, Remote Usability Testing: Good or Evil?

By Kyle Soucy

Published: January 18, 2010

“Recently, there has been a surge in the number of tools that are available for conducting unmoderated, remote usability testing—and this surge is changing the usability industry.”

Conducting traditional synchronous, or moderated, usability testing requires a moderator to communicate with test participants and observe them during a study—either in person or remotely. Unmoderated, automated, or asynchronous usability testing, as the name implies, occurs remotely, without a moderator. The use of a usability testing tool that automatically gathers the participants’ feedback and records their behavior makes this possible. Such tools typically let participants view a Web site they are testing in a browser, with test tasks and related questions in a separate panel on the screen.

Recently, there has been a surge in the number of tools that are available for conducting unmoderated, remote usability testing—and this surge is changing the usability industry. Whether we want to or not, it forces us to take a closer look at the benefits and drawbacks of unmoderated testing and decide whether we should incorporate it into our usability toolbox.

To clarify, there are a lot of tools out there that label themselves as usability testing tools, but don’t actually offer the capability of doing usability testing with users through task elicitation. Some of these tools are nothing more than survey tools, Web analytics tools with new and improved visuals—such as CrazyEgg and clickdensity—or Web analytics tools that turn analytics data into videos of actual user sessions—such as Userfly, ClickTale, TeaLeaf, and Clixpy. All of these tools provide a wealth of data about your Web site’s users. However, such tools are not the focus of this article. Instead, this article focuses on unmoderated usability testing tools that actually simulate traditional usability testing by asking participants to complete a series of tasks using your user interface and answering questions about their experience.

What You Can Learn

“Unmoderated usability testing lets you do test sessions with hundreds of people simultaneously, in their natural environment.”

Unmoderated usability testing lets you do test sessions with hundreds of people simultaneously, in their natural environment, which, in turn, provides quantitative and even some qualitative data. The exact metrics and feedback you can collect vary, depending on the tool you use. (I’ll provide a list of unmoderated usability testing tools later.) Most unmoderated testing tools can gather the following quantitative data:

  • task-completion rate
  • time on task
  • time on page
  • clickstream paths
  • satisfaction ratings or opinion rankings
  • Web analytics data—such as browser, operating system, and screen resolution
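As a rough illustration of how the first two metrics in this list are derived, here is a minimal sketch in Python. The record layout (`TaskResult` and its fields) and the sample data are hypothetical; each tool exports its results in its own format, so treat this only as an outline of the arithmetic.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskResult:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant_id: str
    task_id: str
    completed: bool          # did the participant reach the task's end state?
    seconds_on_task: float   # time from task start to finish or abandonment


def summarize(results):
    """Return {task_id: (completion_rate, median_seconds)} for each task."""
    by_task = {}
    for r in results:
        by_task.setdefault(r.task_id, []).append(r)
    summary = {}
    for task_id, attempts in by_task.items():
        # Completion rate: share of attempts that reached the end state.
        rate = sum(a.completed for a in attempts) / len(attempts)
        # Median time on task is less sensitive to a few very slow sessions.
        summary[task_id] = (rate, median(a.seconds_on_task for a in attempts))
    return summary


# Made-up sample data for a single task.
sample = [
    TaskResult("p1", "find-phones", True, 95.0),
    TaskResult("p2", "find-phones", False, 210.0),
    TaskResult("p3", "find-phones", True, 120.0),
    TaskResult("p4", "find-phones", True, 80.0),
]
print(summarize(sample))  # {'find-phones': (0.75, 107.5)}
```

The median is used here rather than the mean as a design choice, because a handful of participants who step away mid-task can otherwise dominate the average.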

Most of these tools can also capture qualitative feedback as users complete their tasks—such as users’ suggestions and comments. This is where the true value of unmoderated usability testing can come into play.

Some unmoderated testing tools can recruit users for tests by intercepting them on your live Web site. This lets you collect invaluable data on participants’ true intent and motivation for visiting your Web site.

How Actionable Is the Data?

“How actionable your data is depends heavily on the types of tasks you ask participants to perform. … The self-reported feedback and comments you get in response to open-ended questions can be the most valuable data you collect during an unmoderated test.”

How actionable your data is depends heavily on the types of tasks you ask participants to perform. If you have participants perform scavenger-hunt tasks—asking them to find specific content on a Web site—you may miss out on important feedback. Just because someone was able to find the information you requested doesn’t mean they understood it. To elicit more valuable information, you should try to make finding tasks more meaningful by having participants answer a question about the information they were asked to find. For example: Using the Web site, please find out which Smartphones are available on Verizon Wireless. Where can you purchase these phones locally?

The self-reported feedback and comments you get in response to open-ended questions can be the most valuable data you collect during an unmoderated test. Sometimes users’ direct quotations can be just as impactful as videos, especially when you start to see a consensus building among different participants.

Take satisfaction ratings and opinions with a grain of salt. Pay closer attention to what users actually do—not what they say they do. Participants can have a terrible experience using a user interface and still give it a high satisfaction rating. For this reason, I suggest asking participants open-ended questions about their experience rather than having them rate it.

Also, you must keep in mind that Web analytics alone cannot paint the full picture. Just because it took someone longer to complete a task doesn’t mean it was harder to complete. They could just have been more interested in the content. Without asking participants, you don’t really know for sure. You must be careful not to make hasty assumptions that are based on just the quantitative data you’ve collected.

Conducting Unmoderated Usability Tests

“Who participates in your test is just as important for an unmoderated usability test as it is for a moderated test. Your team will base important design decisions on the data you obtain, so participants should be real or prospective users of a product.”

Creating and administering an unmoderated usability study is similar to the process of creating and administering an online survey, but with the additional steps of a traditional usability study, as follows:

  • Define the study. Decide what tasks you are going to ask participants to perform, the order of the tasks, and what follow-up questions you want to ask them about their experience. Unfortunately, since you are not observing the tests, you can’t ask probing or follow-up questions on the fly, depending on what participants do. However, some unmoderated usability testing tools let you structure tests to ask probing questions after users perform specific interactions with a user interface.
  • Recruit participants. You can choose to do the recruiting yourself or hire a recruiter. As I mentioned earlier, some unmoderated testing tools offer you the options of either intercepting users on your live Web site or recruiting them from the tool developers’ own panels of participants—which are pools of test participants they’ve recruited in advance. You should be careful when choosing participants from such panels as your representative users. Who participates in your test is just as important for an unmoderated usability test as it is for a moderated test. Your team will base important design decisions on the data you obtain, so participants should be real or prospective users of a product.
  • Launch your test and send email invitations. Typically, an unmoderated test should be only 15–30 minutes in duration—comprising approximately 3–5 tasks—because the dropout rate tends to increase if a test takes longer.
  • Analyze your results. Most unmoderated testing tools offer live, real-time reporting during tests.

Benefits and Drawbacks

“Nothing beats watching participants in real time and being able to ask probing questions about what they are doing as it’s happening—and you’ll miss out on this opportunity.”

Before choosing to conduct unmoderated usability tests, it’s best to take a look at their benefits and drawbacks in comparison to traditional moderated testing.

Benefits of unmoderated usability testing include the following:

  • You can test hundreds of people simultaneously—while keeping them in their own natural environment.
  • You can test multiple Web sites simultaneously—for example, competitor Web sites, different brands, or Web sites for different countries.
  • You can test at a reduced cost—depending on the tool you use. There are definitely unmoderated usability testing tools that have ridiculously high prices, but some recent tools are very affordable, which can make unmoderated usability testing a less expensive option. (See my list of unmoderated usability testing tools.) Also, the participant honorariums for unmoderated tests are typically a lot lower.
  • Doing unmoderated usability testing is a great way of planting the seed of UCD methodologies and introducing usability testing into a company, using limited resources and budget—assuming you can use one of the less expensive testing tools.
  • There are fewer logistics to manage, with no need to set up testing schedules, set up and moderate individual test sessions, or worry about no-shows and getting last-minute replacements.

Drawbacks of unmoderated usability testing include the following:

  • Nothing beats watching participants in real time and being able to ask probing questions about what they are doing as it’s happening—and you’ll miss out on this opportunity.
  • Some participants may be interested only in earning the honorarium you’ve provided as an incentive. So, rather than taking the time to really perform each task and provide feedback, they’ll just click through the tasks without much thought. Luckily, you can filter such participants out of your findings by looking at their time on task or open-ended feedback. Depending on the capabilities of your chosen testing tool, this task can either be time consuming or quite painless.
  • You cannot conduct interview-based tasks. Participants who are passionate about the tasks they are performing interact with a user interface differently from those who are just doing what they are told.
  • Web analytics can mislead you by giving a wrong impression of a user’s experience. Also, what participants report on surveys can be very different in comparison to what they actually do. You can’t rely solely on rankings and satisfaction ratings to create an accurate picture of what your users actually need and want. Therefore, you should always include qualitative research questions in your unmoderated studies and analyze the self-reported feedback. If necessary, follow up with participants after a study to discuss their feedback.
  • It’s possible for participants to think they’ve successfully completed a task when they haven’t. To move on to the next task, participants must be able to decide whether they’ve completed their current task. For this reason, you need to develop straightforward tasks that have well-defined end states.
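The filtering of click-through participants described in the list above, by looking at time on task and open-ended feedback, can be sketched as follows. This is only an illustration in Python: the session fields (`participant_id`, `total_seconds`, `comment`) and both thresholds are assumptions, not settings from any particular tool, and flagged sessions deserve a manual look before you discard them.

```python
from statistics import median


def flag_low_effort(sessions, min_fraction=0.3, min_comment_words=3):
    """Return ids of participants who look like they clicked through.

    sessions: list of dicts with hypothetical keys
      'participant_id', 'total_seconds', 'comment'.
    A participant is flagged when their total time is below
    min_fraction of the group median AND their comment has fewer
    than min_comment_words words. Both thresholds are illustrative.
    """
    typical = median(s["total_seconds"] for s in sessions)
    flagged = []
    for s in sessions:
        too_fast = s["total_seconds"] < min_fraction * typical
        too_terse = len(s["comment"].split()) < min_comment_words
        if too_fast and too_terse:
            flagged.append(s["participant_id"])
    return flagged


# Made-up sample sessions.
sessions = [
    {"participant_id": "p1", "total_seconds": 900,
     "comment": "The plan comparison table was hard to find."},
    {"participant_id": "p2", "total_seconds": 1100,
     "comment": "Checkout was easy, but store locator failed."},
    {"participant_id": "p3", "total_seconds": 120, "comment": "ok"},
    {"participant_id": "p4", "total_seconds": 1000, "comment": "fine"},
]
print(flag_low_effort(sessions))  # ['p3']
```

Requiring both signals keeps a fast but thoughtful participant, or a slow but terse one, in the data set.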

When Should You Conduct an Unmoderated Test?

“Sometimes, you need greater numbers to give stakeholders the warm-and-fuzzy feeling they need to make million-dollar design decisions. … Use your large samples from unmoderated testing to help put big numbers behind some key findings from your initial moderated research.”

Have you ever presented findings from a moderated usability test only to receive push back on the data, because only 5–10 people participated in your study? As usability professionals, we know when our data is actionable. (How many times do you need to see someone fail to complete a task before you know it’s a problem?) But, sometimes, you need greater numbers to give stakeholders the warm-and-fuzzy feeling they need to make million-dollar design decisions.

Unmoderated usability testing can yield an enormous amount of data and feedback from participants, but you should not use it as a replacement for moderated usability testing. Unmoderated testing is best when you use it in conjunction with moderated testing. Use your large samples from unmoderated testing to help put big numbers behind some key findings from your initial moderated research.
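One way to show stakeholders what larger samples buy is to compare the uncertainty around the same task-completion rate at two sample sizes. The sketch below uses the Wilson score interval, a standard formula for a proportion; the choice of that interval, the 95% level (z = 1.96), and the sample counts are my assumptions for illustration, not something any tool in this article prescribes.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a completion rate."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# The same 75% completion rate, observed with 8 and with 200 participants.
small = wilson_interval(6, 8)
large = wilson_interval(150, 200)
print(f"n=8:   {small[0]:.0%} to {small[1]:.0%}")
print(f"n=200: {large[0]:.0%} to {large[1]:.0%}")
```

With 8 participants the interval spans roughly 41% to 93%, while with 200 it narrows to roughly 69% to 80%. The finding itself is the same; the larger sample only makes the number easier to defend.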

Keep this in mind:

  • Moderated testing is still much better suited for multifaceted products or complex tasks that don’t have a structured sequence of steps.
  • Unmoderated testing is most effective when you have very specific questions about how people use a user interface for relatively simple and straightforward tasks.

Overview of Some Unmoderated Testing Tools

“New unmoderated testing tools are constantly appearing, so I urge you to use this list only as a starting point in your process of finding the best tool for your needs.”

Please note that this is not a complete list of all available unmoderated testing tools. New unmoderated testing tools are constantly appearing, so I urge you to use this list only as a starting point in your process of finding the best tool for your needs. Also, the pricing and feature list for each tool is current only as of this article’s date of publication. Since pricing and feature sets can change, you should visit these companies’ Web sites for the most current information about their offerings.

Keynote WebEffective

Self-service option: Yes

Software download required: Yes

Recruiting options: Intercept users, use their panel, or do it yourself

Pricing: $$$$

UserZoom

Self-service option: Yes

Software download required: Yes, but there is an optional version, without clickstream tracking, that doesn’t require a download.

Recruiting options: Intercept users, use their panel, or do it yourself

Pricing: $$$$

RelevantView

This tool includes the ability to conduct card sorts and Chalkmark-like tests—see Chalkmark.

Self-service option: Yes

Software download required: No

Recruiting options: Use their panel or do it yourself

Pricing: $$$$

Webnographer

Self-service option: No

Software download required: Yes

Recruiting options: Intercept users, do it yourself, or they’ll recruit for you

Pricing: $$$$

Note—No screenshots were available.

Morae Autopilot

This is an in-person, unmoderated testing tool. It does not provide the ability to conduct unmoderated, remote testing.

Self-service option: Yes

Software download required: Morae Recorder must be installed on the computer running the test.

Recruiting options: Do it yourself

Pricing: $$

Note—This pricing is for a one-time purchase of unlimited usage of Autopilot, and you are also buying the entire Morae package, which you can use for moderated, as well as unmoderated testing.

Loop11

This low-cost option offers many of the same benefits and features as the higher-priced tools.

Self-service option: Yes

Software download required: No

Recruiting options: Do it yourself

Pricing: $

OpenHallway

This tool also records on-screen interactions and audio.

Self-service option: Yes

Software download required: No

Recruiting options: Do it yourself

Pricing: $

UserTesting.com

This tool also records on-screen interactions and audio.

Self-service option: Yes

Software download required: No

Recruiting options: Panel, but do it yourself is a custom option.

Pricing: $

EasyUsability.com

Self-service option: Yes

Software download required: No

Recruiting options: Panel only

Pricing: $

Usabilla

This tool uses task elicitation to collect opinions and feedback and also to find out where people first click—either a static image or a Web address, or URL—to complete a task.

Self-service option: Yes

Software download required: No

Recruiting options: Do it yourself

Pricing: $

Chalkmark

This tool uses task elicitation to find out where people first click a static image to complete a task.

Self-service option: Yes

Software download required: No

Recruiting options: Do it yourself

Pricing: $

Treejack

This tool uses task elicitation to find out what link people first click in an information architecture to find information.

Self-service option: Yes

Software download required: No

Recruiting options: Do it yourself

Pricing: $

Resources

de la Nuez, Alfonso. “An Attainable Goal: Quantifying Usability and User Experience.” (For subscribers only.) User Experience Magazine, Volume 7, Issue 3, 2008. Retrieved September 10, 2009.

de la Nuez, Alfonso, and Kim Oslob. “What’s the Real Value Behind Unmoderated Remote User Testing?” UserZoom Blog, September 9, 2009. Retrieved January 15, 2010.

Farnsworth, Carol. “Getting Your Money Back: The ROI of Remote Unmoderated User Research.” (For subscribers only.) User Experience Magazine, Volume 7, Issue 3, 2008. Retrieved September 10, 2009.

Mach, Sabrina. “Is All Remote Usability Testing The Same?” FeraLabs Blog, February 24, 2009. Retrieved September 10, 2009.

Tulathimutte, Tony. “Read Chapter One of Remote Research!” Rosenfeld Media, January 26, 2009. Retrieved September 2, 2009.

Tullis, Tom. “Automated Usability Testing: A Case Study.” (For subscribers only.) User Experience Magazine, Volume 7, Issue 3, 2008. Retrieved September 10, 2009.

15 Comments

Thanks, Kyle, for sharing this. I used User Zoom for remote usability testing on a multi-phase project, in conjunction with moderated testing, and the final results were very good. In a couple of weeks, I will use User Zoom for an online survey.

You’re welcome, Camilo! Thank you for sharing your experience with conducting moderated testing in conjunction with unmoderated testing using User Zoom. Good luck with your survey!

Nice, but since it’s an article focused more on the technique and not so much a tool review, I have to say that it was a bit vague, to be honest, particularly when pricing is mentioned. I wish more attention had been paid to features and value instead.

URUT is actually a highly complex research method, and this must be taken into account. On one hand, Kyle talks about “ridiculously high prices” of some tools, when this judgment is completely dependent on the actual value the tool can provide. Just because new, low-cost tools have come out on the market in the past year or so, you can’t just assume the ones that have been out there for years, solving highly complex research problems with sophisticated engineering work, are “ridiculously expensive.” Why is there this difference, one should wonder. Much the same has happened with other types of tools like online surveys or CRM systems, where no one labels SAP as “ridiculously expensive” because Salesforce.com offers a SaaS, low-cost version. SAP offers a lot more, therefore, it costs more. It’s that simple.

Thanks Camilo for the note! ;)

Hi Alfonso,

As you stated, the focus of this article is not to be a tool review, hence the limited information given on each tool mentioned. The list is provided only to help give a starting point in the process of finding a tool. I prefer to let the reader separate the calves from the herd, so to speak.

I refrained from giving numbers for pricing since some of the tools mentioned have custom pricing, depending on the needs of the research study. I also didn’t think it was appropriate for me to post the pricing, because some of these companies don’t even post it on their own Web sites—including yours, UserZoom. You’re absolutely right that my idea of high pricing might be different from someone else’s, and whether someone thinks a tool or service is valuable is entirely subjective.

In my research, I have found that a couple of these lower-cost tools actually provide a lot of the same results for a fraction of the price. But, with the majority of these tools, I agree that you get what you pay for.

Best, Kyle

Hi Kyle, I understand what you mean.

The question is: Did your research include actually using the tools for a real study and gathering results—like a trial study? Or, by research, do you mean requesting information from each vendor and reading the Web site? If it’s the second, I can understand why you came to the conclusion that “a couple of these lower-cost tools actually provide a lot of the same results for a fraction of the price,” because the information on features provided by most of the vendors’ Web sites does not clearly differentiate them. (We’re working on this now, btw.)

So I guess my point is that expensive or cheap really depends on value and value should / can only be measured after you use or try out a product, right?

I don’t mean to create too much controversy here, but I just think the reader should be fully informed. I appreciate the article being published in any case. ;)

Thanks, Kyle, for writing an article on URUT. There definitely is a huge trend toward gaining valuable metrics and quantitative data to validate findings. A few comments from a researcher perspective and from someone who has vast experience using this method.

“Nothing beats watching participants in real time….” While it is true this method does not replace watching participants, we must understand its value. Researchers don’t just use one method; we have many in our toolkit. The value is that it complements qualitative research and fills in where qualitative methods may lack. On the other end, you can say that nothing beats collecting quantitative metrics and participants walking through a site without being intimidated by being watched. I think what we have to look at is the value and how this method can get us in front of the stakeholders with more statistically significant data.

“Some participants may be interested only in earning the honorarium you’ve provided as an incentive….” This is true with even lab-based research. If it wasn’t, we wouldn’t have to offer incentives. However, with URUT, you can set up Quality Control filters to ensure participants are truly putting in an effort. In addition, you have more flexibility because of the numbers you are running. If you have a lab study and run only eight and two are bad participants, you need to run two more participants. With URUT, you can take out the outliers and bad data and still have statistically significant data.

“You cannot conduct interview-based tasks. Participants who are passionate about the tasks they are performing interact with a user interface differently from those who are just doing what they are told.” You don’t always have to give tasks with URUT. Rather, you can intercept customers as they come to a site and have them do natural tasks. At the end, you can ask them questions regarding their experience.

“Web analytics can mislead you by giving a wrong impression of a user’s experience. Also, what participants report on surveys can be very different in comparison to what they actually do. You can’t rely solely on rankings and satisfaction ratings to create an accurate picture of what your users actually need and want.” This is not just Web analytics with surveys. With the right technology or tool, you can apply validation to your tasks to ensure accurate collection of Success and Error ratios. Validation can be in the form of questions they can answer correctly only if they did the task correctly, URNs by Web page, or by time. With this method, you combine the paths—analytics—with the survey data and the validation to create a complete picture of what happened. There is no misleading in the user’s experience. In addition, you can ask open-ended questions to gain qualitative data, so you aren’t locked into just ratings and multiple-choice questions.

In short, I have learned that there are many misconceptions in regard to URUT. I encourage researchers to explore and take the time to understand the value in using this method. It will not only complete your usability toolkit, but open the doors to gaining rich data that you couldn’t otherwise collect in any other way.

Kim

Alfonso,

Actually, I have been fortunate enough to use some of these tools for real client studies. However, I haven’t used them all, which is why I made a conscious effort not to evaluate the tools and to refrain from stating any personal opinions in the article.

When URUT tools first started coming out, I had a horrible opinion of them. Once I finally had the experience of using URUT with different tools and seeing the value of the method, I wanted to write this article so that other UX researchers gave it a fair shot before shooting it down as an option.

I’m in no way stating that any of these tools isn’t valuable. I believe all the tools mentioned received both fair and equal coverage relative to the intent of the article.

Kim,

First, I think readers need to understand that you also work for UserZoom.

“While it is true this method does not replace watching participants, we must understand its value. Researchers don’t just use one method; we have many in our toolkit … it complements qualitative research and fills in where qualitative methods may lack.” That is exactly why I highly suggested the benefit of using large samples from unmoderated testing to help put big numbers behind some key findings in initial moderated research.

“This is true with even lab-based research. If it wasn’t, we wouldn’t have to offer incentives.” I agree that this is true no matter how the testing is performed. The point was that although real users may be participating only because of the honorarium, their feedback is much different from that of someone who would never use the product, since future changes would not impact them. In fact, I’ve done many tests over the years without even offering an honorarium. Oftentimes, the incentive for real users of products is to have the opportunity to make the products they use better.

“However, with URUT, you can set up Quality Control filters to ensure participants are truly putting in an effort.” While I agree that this is a valuable feature, the majority of URUT tools that I’m aware of don’t have that ability. Broad statements about what you can do with URUT should only be made if you know that it can be done with all tools, not just some. When you do this, you are defending the tools, not the method.

“You don’t always have to give tasks with URUT. Rather, you can intercept customers as they come to a site and have them do natural tasks.” Again, the majority of URUT tools that I’m aware of don’t have this feature. It’s wonderful that some tools have these capabilities, and I wish more URUT tools did as well.

“This is not just Web analytics with surveys … There is no misleading in the user’s experience … you can ask open-ended questions to gain qualitative data, so you aren’t locked into just ratings and multiple-choice questions.” I agree that open-ended questions are the best way to gain qualitative data, but if you present only the Web analytics, it can be misleading. That is how I stated this in the article.

~ Kyle

Thanks, Kyle. I believe it’s great to be having this discussion.

I didn’t mean to suggest that not all tools received fair and equal coverage. Again, I understood it was not a tool review type of article. My only point or concern is just with the adjective ridiculous. In this case, it was used when referring to pricing in a way that, in my humble opinion, may be misleading and requires further detailing.

Regarding Kim—or myself—working for UserZoom, I’d like to say that I wrote www.userzoom.com when I added these comments, but my name is linked somewhere else. I think that’s what happened to Kim, and that’s why she didn’t mention that she works for UserZoom.

I’d be very happy to offer you a demo of our tool. I’m quite confident that, when you see it, you’ll immediately understand what I’m talking about. I realize URUT is relatively new to many, but we didn’t just come out to the market. We’ve been doing this for 8 years now.

If you are interested, get in touch.

For the sake of completeness, let me introduce UserFeel, which we published last week. UserFeel’s remote usability testing differentiating factor is our large panel of multilingual users, so you can use it to test non-English sites as well.

To be fair, I would like to introduce your readers to Userlytics.com, providing a real-time, remote, unmoderated usability testing service that produces a high-quality video that consists of: full-screen recording of everything that transpires on the desktop, not just the browser, audio sound recording, and synchronized Web cam recording of the tester’s face, mouse movements, and clickstream. And, in the near future, Userlytics.com will be adding moderated testing that will allow the moderator to create annotations during a task, chat and direct a tester during a task, as well as full video editing capability.

Dear Kyle, there has been a small glitch above. Webnographer does not require a software download. Participants simply click a link in an email invite or on an advertising banner. This takes them straight to the research where Webnographer records all their interactions on the Web site while carrying out a task. This is also combined with pre- and post-task questions.

It is very important that a remote testing tool does not require a download. In the many studies we have carried out for our clients using Webnographer, a large number of people have participated in the research during office hours and from their work desk. If a download were required, many of those people would be excluded from partaking, because most of them wouldn’t have administrator rights to install software onto their work computer. Luckily Webnographer does not have that problem, because no download is required.

Where Webnographer clearly stands out from the other unmoderated tools is on the analysis side, going far beyond simple success rates. Webnographer provides intelligent analysis techniques built into the tool. It does not require any laborious Excel exports. All analysis happens in the Webnographer tool, including filtering of cohorts and satisficers, verifying success or failure, and much more.

We have started writing about our research experiences on our blog, where we also provide tips and tricks for conducting unmoderated usability tests. If you are interested, please visit.

Many thanks, Sabrina Mach

Webnographer

Thank you, UserFeel and Userlytics, for mentioning your tools, so people who read this article are aware of them.

Sabrina, thank you for pointing out the correction that Webnographer does not require a download. I was surprised when researching the tool last year that there weren’t any screenshots or product tours available, and I’m even more surprised that there still aren’t. Any chance you’ll be posting more information about this tool anytime soon?

This is a very interesting article. I’d like to add another alternative to the mix, which has launched since this article was originally written: Kupima.

Broadly similar to the approach taken by usertesting and userlytics, Kupima offers video-based remote testing using either our own panel of manually approved users or choosing to use your own users—via a special hyperlink. The difference is that Kupima also offers quite a degree of control over the tasks and question types, which, for discrete answers, are automatically charted, and we will shortly be releasing functionality to create an instant report in PDF format, showing feedback summaries and key insights, plus charts.

I’d be thrilled to bits if anyone here would also consider Kupima as an option. It’s feature-competitive, affordable, and has something the others don’t have: lots of orange men. :)

You can also check out beeux.com. Have fun!

Thank you for this text, very informative. It is true that nothing is better than testing the users in reality. However, well-constructed automated tests come pretty close. I wanted to add to the mix another tool that wasn’t around when the text was written, namely Automated Testing from Usability Tools. It is priced reasonably and includes a free credits pack for new users, which allows you to run a free automated test for 100 respondents. Cheers!
