Creative Ways to Use Unmoderated User Research

By Demetrius Madrigal and Bryan McClain

Published: August 9, 2010

“There are a number of ways you can use unmoderated user research tools that can provide a great deal of value.”

Over the past year or two, unmoderated usability testing has become a popular option to help guide product design. It is especially popular for Web sites, providing startups the opportunity to get relatively quick-and-easy user feedback on design iterations. From a user research perspective, the improper use of unmoderated research services presents a certain amount of danger. However, there are a number of ways you can use unmoderated user research tools that can provide a great deal of value. This month, we’ll discuss some of the more interesting ways in which you can derive value from unmoderated research tools.

One caution: When considering doing unmoderated user research, it’s important to keep in mind that unmoderated user research is never as good as moderated user research. Never attempt to replace necessary moderated user research with unmoderated user research.

One huge fallacy we sometimes encounter is the belief that some user research is always better than none. Unfortunately, this is completely untrue. Improperly conducted user research can lead to bad decisions about product direction that can result in your inaccurately defining a product’s target market, defining the wrong key functionality for a product, or designing poor user interfaces. Each of these issues is enough to doom a product to failure when you release it to the market.

To compound this problem, often decisions that are based on the findings of user research—regardless of its soundness—receive more trust than they deserve, so they are less likely to be challenged and corrected than if you’d conducted no user research. There are many ways in which user research can go wrong, but we’ll save that for another column. For now, we’ll focus on ways of making good use of unmoderated user research tools.

Beyond Usability Testing

“While unmoderated user research does not replace moderated user research, it can be very effective in augmenting moderated user research.”

Unmoderated user research tools tend to focus on usability testing, but there’s no reason why you can’t use some of these tools for performing unmoderated concept testing or even miniature ethnography studies. For example, you could construct tasks along the lines of: “Please demonstrate how you would make a purchase from your favorite online store.”

While unmoderated user research does not replace moderated user research, it can be very effective in augmenting moderated user research. For example, generative user research such as ethnography can be extremely costly, but a company can hold down costs by performing ethnographic research with fewer participants, then supplementing its data through unmoderated research sessions.

When performing user research, we look for trends. It’s very important to distinguish between behavioral trends and idiosyncratic behaviors when determining design recommendations. Distinguishing between trends and idiosyncrasies requires many participants—a major factor affecting schedule and budget. Unmoderated user research can be an effective and low-cost method of obtaining the data that lets you make this distinction. You can use moderated sessions to identify and thoroughly understand the behaviors that are of interest. Then, to verify the trends you’ve observed, look for those same behaviors in unmoderated sessions. It’s best to follow this rule: Do not use the unmoderated sessions to identify additional behavioral trends, because the understanding you can glean from an unmoderated session tends to be superficial.

Combining limited ethnographic studies with unmoderated user research isn’t as effective as doing a full ethnographic study, but it is a way for cash-strapped startups to get some invaluable consumer insights. This approach of augmenting your moderated user research by involving larger numbers of participants through unmoderated sessions works with nearly any form of user research.

Longitudinal Usability Studies

“Unmoderated user research enables some innovative approaches to usability testing. One of these is longitudinal testing.”

Unmoderated user research enables some innovative approaches to usability testing. One of these is longitudinal testing. By taking a longitudinal approach to usability studies and gathering data from the same people over an extended period of time, we can learn how a person forms a long-term relationship with a product.

This approach lets us explore different levels of usability. Most usability studies that evaluate iterative designs focus heavily on discoverability, but a longitudinal study can also acquire data about the learnability and ultimate usability of a product once a user has become fully accustomed to its user interface.

To get a more complete picture of a product’s user experience, it’s useful to pair your data from longitudinal testing with another longitudinal data source such as diary data. The unmoderated testing sessions provide data that is similar to diary entries describing interactions, while the diary data indicates participants’ goals in using a product, their perceptions of the product, and the nature of their relationship with the product. For example, diary data might indicate that usage is extraordinarily high at first, but then dies out as the product loses its novelty, suggesting a need for continuously updated content. Conversely, diary data may show low early usage, followed by an explosion in long-term usage, pointing to usability issues that affect adoption.

Longitudinal unmoderated usability testing has some special requirements to be effective. First, the automated testing software must allow you to select your own participants, so you can test the same people multiple times. Second, you need to determine a testing schedule that allows you to observe the factors affecting change. For example, to obtain data about learnability, you must schedule a test session during the critical period when a participant might be having difficulty learning a new user interface. Too early and you would be testing discoverability; too late and the participant would already have learned the user interface. The proper frequency of testing depends on a participant’s frequency of use of the product. Thus, a higher-use product requires more frequent testing.

It might be helpful to give participants some basic structure for how they’ll use the product during testing—for example, engaging with the product for at least 30 minutes each night. However, the trade-off for this kind of structure is that you lose the ability to examine participants’ organic usage of the product. This kind of study can provide some amazing data—of a kind that rarely gets captured at present—but you’d need to tailor your testing method to each product.

Conclusion

“Since it’s untrue that some user research is always better than none, it’s extremely important that you become a conscientious and discriminating consumer of research.”

This column has described just a couple of creative, innovative ways in which you can use unmoderated user research tools. We’ve described how, by using unmoderated user research tools,

  • you can augment the data you’ve acquired through moderated research sessions, reducing the number of participants you need for your moderated studies when identifying trends
  • you can perform research that requires repeated measures—for example, longitudinal studies that explore the long-term relationship between a user and a product

However, there are some constraints on the use of unmoderated user research tools that you should be aware of. For one, these tools currently work only for computer-based software. Therefore, they may not be useful to anyone developing software for a mobile platform or for hardware. At some point in the future, these tools may become available for mobile platforms, but we are unaware of any such solutions at present.

Unmoderated user research sessions are always inferior to moderated sessions—all other factors being equal—so use them wisely. And since it’s untrue that some user research is always better than none, it’s extremely important that you become a conscientious and discriminating consumer of research. As we mentioned in our column last month, a little education can be extremely valuable in this regard. A thorough education in research methods also enables you to innovate new research methods that can open up new ways of understanding users.

The two applications of unmoderated user research tools we’ve described in this column illustrate the general concepts behind combining unmoderated sessions with other forms of data to get a more complete picture, but there are likely other effective methods, too. We invite you to share your ideas for other useful methods of using unmoderated user research tools in the comments.

9 Comments

Great article, guys. We get asked about this a lot, and I agree augmenting moderated with unmoderated and longitudinal studies are the best ways to do it. The only thing I’d add is that stakeholders and large teams love unmoderated additions to moderated studies, and it can be amazing ammunition.

Thanks for the interesting article. I’m not totally clear if, in referring to unmoderated studies, you’re referring to small N studies—as with UserTesting.com—or large N studies—as with Keynote. Since they really offer very different types of findings, you would compare them with moderated—qualitative, small N—studies very differently.

In thinking about the small N case, I find myself slightly hung up on content in the 7th paragraph, which starts out with the statement: “When performing user research, we look for trends.” I like the distinction you make between behavioral and idiosyncratic trends—I agree they are very different, but maybe for different reasons. Specifically, when looking at behavior through moderated studies, it is easier to see why that behavior is taking place—since you can probe or see more of the overall interaction in a moderated situation. And knowing why is much more important than seeing a “trend,” in my opinion. This makes sense with behavioral insights, but not with idiosyncratic ones—such as preferences—knowing why people like blue is only so helpful when N is small.

So many user researchers focus on seeing trends to decide whether to report an issue, and the problem with this is that it puts the researcher and ultimately the consumer of the research into a mindset of evaluating the validity of the findings based on the number of participants that had the problem. These are not quantitative studies, and validity should not be determined through statistical rules. The validity of a finding in qualitative user research is based on the nature of the behavior observed—for example, did the problem stem from reasonable previous behavior and understandable assumptions about how the product worked, on the part of a representative member of the target audience? There are published accounts of findings where N=1, which are totally valid in qualitative studies.

Where this relates to your article is that it’s much harder to observe important aspects of the user experience in an unmoderated approach, so that kind of validity can’t be easily established.

This leads me to consider the unmoderated studies where N is big—for example, Keynote or Vividence-like studies. In this case, you trade away the moderation specifically in order to get the big numbers, so you can do statistical tests. This is a totally different method of determining validity, and though it doesn’t give you the why, you get the what—assuming you’re measuring the right thing—and how much. That seems very complementary and different, rather than additive, when compared to moderated studies. Further, you might use such an approach in order to get a broader spectrum of your target audience, which you might not be able to conveniently recruit or access from your corporate headquarters.

Thanks for sending this out. I’d be curious to hear from other practitioners what tools they are using and how they are working for them.

Thanks, Nate, we really appreciate your perspective in this area.

Also, thank you, Christian. We definitely feel that there are areas in which a researcher must exercise judgment when deciding which findings to report. Trends are an easy concept for people to understand and the vast majority of reported data are trends. On the other hand, just as you pointed out, qualitative studies shouldn’t be treated like quantitative studies, in which inferential statistical analyses are applied. There are definitely findings that can be derived from a single user that should be reported. For example, in ethnographic research this might be in the form of a pain point or area of opportunity. In usability testing, it could be an issue surrounding core functionality. When doing so, it’s important to be mindful of your own biases to ensure that you aren’t just reporting a finding that reflects your own preconceptions.

In any case, it’s up to the researcher to decide whether such findings represent real concerns or idiosyncrasies. Determining how to handle these kinds of findings can be tricky, and it usually takes an experienced researcher to sort them out from the noise. Typically, why a behavior takes place can inform the process by letting you assess the impact of the underlying cause. This is an area in which moderated studies really establish their value, because they allow a researcher to interact directly with a user to understand all of the factors underlying an issue. This information cannot be gained from unmoderated sessions, and it’s also much more useful for deriving design direction, which is why we recommend treating data gathered through moderated and unmoderated sessions differently.

Obviously, this is a complicated endeavor—something I think we should address more fully in a follow-up column.

Thanks,

Bryan

When I read… “One caution: When considering doing unmoderated user research, it’s important to keep in mind that unmoderated user research is never as good as moderated user research”… I’m sorry but I’m already turned off. I actually feel like saying the following:

One caution: When considering doing moderated user research, it’s important to keep in mind that moderated user research is never as good as unmoderated user research.

My point is: Never as good for what exactly? I don’t think you can compare them just like that—so superficially. What is each good for? What are the researcher’s goals? When—as in what scenario—is the research taking place?

I think Christian makes a hell of a point and way better than I could explain it. But after having done tons of studies—both moderated and unmoderated and a lot of time combining the two—I can assure you that each has its strengths and limitations, and I would never say one is better than the other.

If I may suggest, you could say that moderated studies allow you to better get qualitative findings, better understand users’ points of view by interacting with them more and looking them in the eye. Furthermore, you could easily argue that unmoderated research allows you to better quantify issues, reach a broader audience from geographically dispersed areas, obtain behavior data on specific tasks far more cost effectively, and, in many instances, obtain more honest and direct feedback from users who do not have a stranger in front of them.

So yes, I’d say it can be a complicated endeavor and that’s why it takes some experience to understand the difference and, most important, how both methods beautifully complement each other. It’s not a matter of how much better one or another is. I would not try to conclude that or simplify it that way. If there are so many solutions out there, it is because the market is demanding it. One thing I can certainly conclude is that:

Moderated usability testing in the lab is simply not enough.

At least not for many researchers and marketers in today’s highly competitive online marketplace.

Thanks for the article anyhow. Other than this initial “caution” and the ambiguous “inferior” adjective at the end, I think it’s good.

Alfonso

Hi Alfonso,

We think it’s important to keep in mind that there are many different ways to conduct moderated research, and usability testing in a lab is not the only form of moderated user research. Moderated studies can be conducted remotely to address a geographically dispersed audience. They can be conducted contextually by visiting a person’s home. A trained researcher can acquire quantitative data from a moderated study just as easily as from an unmoderated study. A moderator can also observe a participant’s behavior and ask questions about areas in which their statements and behavior don’t seem to agree. And a researcher that communicates effectively and builds rapport with a participant will not feel like a stranger, but a friend.

The reason we say moderated research is better than unmoderated research is that moderated research can do anything unmoderated research can do, with the additional flexibility and depth brought by the presence of a trained researcher. The trade-off is cost, as you mentioned. Unmoderated research is much more cost effective. Thus, the specific research goals for a study drive the decision to use moderated research, unmoderated research, or a combination of the two. If one’s research goals are not overly ambitious, there is no need to incur the extra cost of a moderated study. We absolutely agree that unmoderated research is very good at accomplishing the goals you mention and also that cost is a major factor in determining what research approach to take.

I agree that the two methodologies—moderated and unmoderated—are important for very different end goals. Moderated, face-to-face research allows direct observation of participants and in-depth questioning. It also allows the exploratory topics to be less defined from the outset, as contextual questioning is a possibility. I would undertake remote, unmoderated testing with a tool such as UserZoom if I wanted to understand the scale of issues or to understand behavior and opinions from a large audience. With unmoderated testing, you need a tool that allows you to aggregate and splice the data from all the participants to be able to analyze behavior and opinions; it is with this functionality that these tools come into their own. (Tools such as usertesting.com are less useful, as they have a small sample size and offer no distinct advantages over face-to-face, moderated usability testing.)

I see these methodologies as being complementary rather than saying that one is better than the other. Researchers have a whole tool set from which to design the research solution that best fits whatever problem they are faced with; there is no one methodology that is better than others. I also feel that both moderated and unmoderated research are subject to flaws. If either type of research study is designed or conducted incorrectly, the data produced will be inaccurate.

Hi Demetrius,

I see your point and agree for the most part, but also disagree. I’d start with an easy one:

“…a researcher that communicates effectively and builds rapport with a participant will not feel like a stranger, but a friend.”

Sure thing. I’m a big defender of Lab studies, so don’t get me wrong on this one. You would agree, however, when I say there’s clear value in having a participant take part in a study ‘in his pajamas,’ using his own PC, with no one—as friendly as can be, always a stranger—in front of him. I feel that, as with any other research method, there are positives and not-so-positives with this.

Let me get to the part I strongly disagree on. Now, please keep in mind that I’m thinking of unmoderated, remote usability testing as UserZoom, Keynote, and a few others do it.

You say: “A trained researcher can acquire quantitative data from a moderated study just as easily as from an unmoderated study.”

Yes that’s right, but that researcher will not get statistically significant results that will yield solid conclusions. With UserZoom, you typically test with hundreds of users, providing you with that very important statistical significance.

You also say that “A moderator can also observe a participant’s behavior and ask questions about areas in which their statements and behavior don’t seem to agree.”

Once again, in a lab, you could gather behavior data as well, but would you be able to come up with the most common path taken on a per-task level and feel secure about it? Would you be able to say to a customer that users who used the search engine to complete a search task were 5 times more likely to succeed with that task than the ones who used the navigation menu?

The fact is that one of the strengths of unmoderated, remote usability testing methodologies is how they combine Web analytics with usability metrics. This is key. It’s also why it goes along so well with lab studies. They each concentrate on really valuable things! I’d agree that sometimes the combination of both is not needed, but that’s a different subject, I think. Once again we’d go back to the point of “it all depends so much on the research goals.”

So I guess my overall point is that you can’t consider one either better, superior, or an alternative for the other. The best way to go is to truly understand the different ways to use unmoderated, remote usability testing and also, if possible, to combine the two methods. You’ll get extremely valuable data that way.

Last, I’d like to invite you guys to try UserZoom out and see what I mean. Let me know your thoughts by sending me an email to alfonso at userzoom dot com.

Best, Alfonso

Oh! I forgot to add something: I invite you all to see the Webinar we hosted in June, precisely about all of this. It was called “Combining Lab and Online Usability Testing: Lessons Learned.”

Hope you enjoy it!

Hi Alfonso,

Thanks for the response. I assure you that a researcher can definitely get statistically significant findings from moderated sessions. We’ve done it many times in the past. Typically, it’s not necessary to test with hundreds of users for statistical significance, unless you are trying to identify factors that have a very small effect size, or you are sampling a very heterogeneous population.

Again, it relates primarily to the cost rather than the quality of the research methodology or the data. I think it’s important not to mix up cost and quality. I’d go further to say that I think there are certain inferential statistics that are appropriate for unmoderated testing. These include analyses like t-tests, single-factor ANOVAs, chi-squared, and possibly lag sequential analysis. More complex analyses such as multiple regression correlations, stepwise regression correlations, multi-factor ANOVA, or other analyses resulting from a multifactor or multivariate design are usually better conducted using a moderated approach, in order to control or, at least, monitor those different elements.
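To make this concrete, here is a minimal sketch, in Python, of the simplest kind of analysis we have in mind: a chi-squared test of whether task success depends on the path participants took. The counts are invented purely for illustration, and using scipy is just one convenient way to run the test; nothing here comes from a real study.

    # Hypothetical example: does task success depend on the path taken
    # (site search versus navigation menu) in an unmoderated study?
    # The counts below are invented for illustration only.
    from scipy.stats import chi2_contingency

    observed = [
        [90, 10],  # used site search: [succeeded, failed]
        [55, 45],  # used the navigation menu: [succeeded, failed]
    ]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-squared = {chi2:.2f}, dof = {dof}, p = {p:.4f}")

A small p value, say below .05, would tell you that success rate and path taken are related, and the success rates themselves would tell you how strongly, but it would not tell you why one path works better, which is exactly the gap that moderated sessions fill.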

Of course, both moderated and unmoderated approaches are effective tools with certain advantages for different applications and situations.

Join the Discussion

Asterisks (*) indicate required information.