Why Measure How Users Feel?
Many of us in the field of user experience believe that utility and usability are necessary, but somehow insufficient. Even for the most staid and straightforward business application, users form an affective reaction to the application during initial and subsequent use. It might not be as strongly valenced as, say, the reaction they have when browsing an online store, but the affective reaction is there nonetheless.
All things being equal, users evaluate a system that engenders positive emotional reactions more positively than a system that doesn’t. So we need to know whether—and the extent to which—a system’s use triggers positively valenced emotions.
And on the flip side, we’ve all seen users become frustrated when they can’t figure out how to accomplish their tasks. Just how frustrated are they? Mildly? Moderately? Are they so irritated with your design they’re ready to heave the device out the nearest window? Let’s hope not.
My point is that, until now, we’ve assumed users are somewhat frustrated when we observe indirect behavioral indicators such as menu hunting, false starts, and input errors. And we’ve assumed they’re really frustrated when we’ve heard a sigh of exasperation. However, these observations provide only coarse measures of affect, and very few of us capture them or use them systematically to make comparative judgments.
What’s more, alternatives for measuring delight and frustration—after-the-fact survey questions, verbal self-reports, and retrospective video self-evaluation—are notoriously subject to positivity bias and other vagaries of attribution bias. Wouldn’t you like to have a method that’s more granular and valid when you test your next design?
How Do We Currently Measure Delight and Frustration?
Currently, the most prevalent ways of measuring a user’s delight or frustration when using a product or Web site are retrospective self-report measures such as:
- participant ratings of task satisfaction, task utility, system utility, and so on
- participants’ responses to open-ended questions about desirability, satisfaction, and utility
- repertory grid techniques that use semantic-differential word pairs to elicit participants’ evaluations of a user experience
- Benedek and Miner’s Desirability Toolkit, which lets participants choose descriptive words or phrases that reflect their evaluations of a product’s user experience
Our methods of direct behavioral observation have so far been limited to capturing and rating participants’ verbalizations and utterances reflecting delight or frustration during their use of a product.
Recently, I learned about a new method for assessing users’ emotional response to a product—one that relies on real-time observation of behavior and coding of participants’ facial expressions and gestures. Its creators, Eva de Lera and Muriel Garretta-Domingo, call their method the “Ten Emotion Heuristics.”
The Ten Emotion Heuristics
According to de Lera and Garretta-Domingo’s conception, users’ emotions are intimately bound to their appraisals of a user experience. Therefore, accurately measuring users’ emotions while they are learning and using a product provides a window into the quality of the user experience.
Given the validity and reliability challenges of self-report measures, they have chosen to rely on their observation of the occurrence and frequency of certain facial expressions and body gestures as proxies for users’ affective reactions. For example, if, while using a product, a participant frowned or raised her brow, they coded her expression as confusion, exasperation, or frustration. Not surprisingly, a smile indicated pleasure or delight.
In their recent publication, “Ten Emotion Heuristics: Guidelines for Assessing the User’s Affective Dimension Easily and Cost Effectively,” de Lera and Garretta-Domingo described the advantages of their method: It is both inexpensive to implement and easy to understand. Since their method derives from the seminal research by Paul Ekman and his colleagues, they are confident that this method will prove to be valid across cultures. They have not yet rigorously validated the method, nor do they have reference or baseline measures—as is also the case with Benedek and Miner’s Desirability Toolkit. However, their concurrent observations have strongly indicated the method’s validity, because negative emotional markers nearly always occurred when users made observable errors or encountered usability problems.
I recently had a conversation with Eva de Lera about the “Ten Emotion Heuristics” via instant messenger, as follows:
PJS: I was at the UPA 2007 conference this year and was intrigued to hear about your research project on the “Ten Emotion Heuristics.” Would you briefly summarize what they are and how you came to work on this problem?
EdL: The heuristics are a set of guidelines to help assess a user’s affective state in an easy and cost-effective manner. The idea originated from a need to gather objective satisfaction measures and also a need to have different people gather this data in similar ways. The heuristics are a set of ten emotional cues that we have identified, and we use them as one measure. For example, identifying four or five negative emotional heuristics at the beginning of a given task provides us with a negative user experience measure.
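To make the tallying idea concrete, here is a minimal sketch of how an observer’s coded cues for a single task might be rolled up into a simple valence summary. The cue labels and the threshold of four negative cues are my assumptions, echoing the “four or five negative emotional heuristics” example above; the actual ten heuristics appear in de Lera and Garretta-Domingo’s paper.

```python
from collections import Counter

# Hypothetical cue labels, not the published ten heuristics.
NEGATIVE_CUES = {"frown", "brow_raise", "lip_compression", "lean_back"}
POSITIVE_CUES = {"smile"}

def score_task(observations):
    """Tally the cues an observer logged during one task.

    observations: list of cue labels, in the order they occurred.
    Returns counts of negative and positive cues plus a crude verdict.
    """
    counts = Counter(observations)
    negatives = sum(n for cue, n in counts.items() if cue in NEGATIVE_CUES)
    positives = sum(n for cue, n in counts.items() if cue in POSITIVE_CUES)
    # Assumed rule of thumb: four or more negative cues flags a
    # negative user experience for this task.
    verdict = "negative" if negatives >= 4 else "neutral-or-positive"
    return {"negatives": negatives, "positives": positives, "verdict": verdict}
```

For instance, `score_task(["frown", "frown", "brow_raise", "lip_compression", "smile"])` tallies four negative cues against one positive and flags the task as negative.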