Reading time: 10 minutes
As part of my project work for my studies of Content Strategy at FH JOANNEUM, I decided to conduct usability tests of an internal service portal. The portal allows users to raise requests (e.g. for new hardware or software), to report problems, to seek support from colleagues in a forum, and to access a knowledge base.
After a theoretical introduction to the topic, I share my personal experience that may be helpful for anyone who plans to do usability testing.
Rubin and Chisnell (Handbook of Usability Testing, Second Edition: How to Plan, Design, and Conduct Effective Tests) provide the following description: “Usability testing […] employs techniques to collect empirical data while observing representative end users using the product to perform realistic tasks.”
Krug (Don’t Make Me Think! A Common Sense Approach to Web Usability, Second Edition) emphasises that only usability tests can reveal if a system/tool “really works” because they show how the users think, what they know and how they use the system/tool. Moed, Kuniavsky, and Goodman (Observing the User Experience, 2nd Edition) explain that a usability test “[…] helps identify problems people have with a specific interface and reveals difficult-to-complete tasks and confusing language.”
However, Rubin and Chisnell point out that “testing is neither the end-all nor be-all for usability and product success, and it is important to understand its limitations.” They mention, for example, that the test situation is artificial and that another technique, such as an expert or heuristic evaluation, may be more effective.
The testing toolkit comprises a variety of quantitative and qualitative methods that help to reveal issues as well as positive aspects of a product and that are applied at different points in its lifecycle. Rubin and Chisnell mention methods such as focus group research, surveys and card sorting, amongst others, and explain that these methods “are used in altered and combined form” depending on the specific project objectives.
Lang and Howell (Researching UX: User Research) point out that all methods have blind spots and that combining methods can counteract them. When it comes to choosing the approach, they recommend answering some questions and deciding based on the answers. One of their suggested questions is “Do I need qualitative answers (i.e., to understand things from a user’s perspective), or do I need quantitative answers (i.e., an idea of how many, how often or how much)? Or both?”. According to them, it may be best to use a multi-method approach when the answer is “both”.
Rubin and Chisnell describe thinking aloud as a simple technique to find out what the participants think while performing tasks. They are asked to express what confuses, frustrates or even delights them.
Rubin and Chisnell mention as one advantage of a thinking-aloud test that “preference and performance information” can be captured simultaneously. Additionally, it seems that some participants can concentrate and focus better during such a test. Moreover, they say out loud what they are thinking while performing tasks, which can reveal why certain things do or do not work for them.
However, they are also aware that the technique has its disadvantages. Some participants experience the situation as “unnatural and distracting” and may need more encouragement to think aloud. Rubin and Chisnell also point out: “Thinking aloud slows the thought process, thus increasing mindfulness. Normally, this is considered a good effect, but in this case, it can prevent errors that otherwise might have occurred in the actual workplace. Ideally, you want your participants to pay neither more nor less attention to the task at hand than they normally would.”
Rubin and Chisnell describe flexible scripting as a form of testing that is based on a structured interview conducted by the moderator at the beginning of a usability test. They state that this works well for usability tests which require the participants “to search for some piece of information or specific item on a website”.
Wilson (Interview Techniques for UX Practitioners) explains that a structured interview uses a verbal questionnaire and that the script and the fixed pool of questions limit the interaction. The interviewer is supposed to follow a specific format with minimal deviation. The questions can be open or closed; the response categories for closed questions are standardised. The questions are the same for all participants and are always asked in the same order.
Although Rubin and Chisnell state that the interview is conducted at the beginning of a usability test, in my tests the interviews were done after the scenarios. The reasoning was that participants who are not so familiar with the PSP would find it easier to state their opinion after using the tool.
First and foremost, it is important to conduct the tests in the system whose usability is actually under scrutiny. For the tests conducted during my project work, the test system had to be used because “real” requests had to be avoided. This was not ideal because there are differences between the test and the live system. For one scenario, for example, a request form was available in the test system which no longer exists in the live system. General conclusions could still be drawn, but it is always better to test the system that is to be improved.
Phrasing of the Tasks
It is also important to phrase the tasks carefully because the wording can influence how the participants perform them. It is crucial that no “leading” instructions are used. Basically, the user journey descriptions seemed fine. However, I had the impression that the participants sometimes got a bit confused or even went off track because of them. One task, for example, included the phrase “use the knowledge base”, and one participant was led astray by it although she was already on the right track. In this case, “use the portal” may have been less distracting.
Whenever a scenario includes more than one task to be accomplished, the individual tasks should be clearly indicated. Although my participants had a printed handout, they had to be reminded of the second part of the scenario.
Moreover, the number of scenarios should possibly be reduced. Counting the sub-scenarios, the participants had to perform eight scenarios, which may be too much for one session. Four to five scenarios would probably have generated comparable or even the same results. That would be more convenient for the participants and would reduce the time needed for the analysis of the usability test recordings.
A few details should also be considered concerning the device used for the usability tests. It may be better to conduct the tests on a screen larger than 12.5 inches because the screen size influences the amount of detail that is displayed on a page. It may well be that participants act differently if they can see all available options.
The language settings of the test device should also be taken into account. An error message on my laptop popped up in German and distracted the participants who did not speak German. It was easy to provide a quick translation, but standardised English language settings would have allowed a smoother process for all participants.
Generally, any programs that could disturb the test procedure should be disabled. Grammarly was activated on my laptop and its icon confused the participants because they did not know what it was about. A moving white bar in the background, caused by the program, added to their distraction.
Another important aspect that should not be ignored is the role of the moderator. For an inexperienced person, it can be a very difficult role because the moderator is supposed to keep a neutral facial expression and should not help the participants during the test. However, whenever the participants do not articulate their thoughts, the moderator should prompt them to think aloud by asking questions.
These questions need to be well considered. They should not judge the actions of the participants, should not be “leading”, and should not give the participants the feeling that they did something wrong. This seems easy in theory, but in a spontaneous situation it is much more difficult to avoid such questions. Thus, I had a list of possible questions in the handout that I used during the tests. As it also contained the questions that should be avoided, the handout proved to be very helpful.
Generally, the situation felt very unnatural to me, and I am sure that the participants had the same impression. Thus, it is very important to make them feel comfortable, which can easily be done with some small talk at the beginning of the tests.
For the follow-up interviews, it helped a lot that the interview guidelines had been created for the course Basics of Empirical Research and had been approved by the lecturer.
The interview questions should be phrased according to what is to be found out. As the interviews were conducted after the usability tests to capture some general impressions from the participants, they mainly focused on positive and negative aspects but also included a rating and an open question. As the analysis of the interviews revealed essentially the same aspects as the usability tests, the questions seem to have been appropriate.
However, the hardest part about the interviews was neither the creation of the questions nor the role of the interviewer, but the analysis. First, the interviews need to be transcribed, which takes a lot of time and should be done shortly after the interviews, while the memory is still fresh and certain statements are easier to understand. Then, a theoretical framework must be chosen and applied to the data. I totally underestimated that part and would recommend planning enough time for interview analysis in any project.
Conducting usability tests and interviews sounds easier than it is. When done for the first time, there are many potential pitfalls and not all of them can be avoided. However, making mistakes is not a bad thing per se as long as something can be learned from them.
I learned a lot!
© Photos: Alexandra Wurian
Rubin, J. & Chisnell, D. (2008). Handbook of Usability Testing, Second Edition: How to Plan, Design, and Conduct Effective Tests. Indianapolis, IN: Wiley Publishing, Inc.
Krug, S. (2006). Don’t Make Me Think! A Common Sense Approach to Web Usability, Second Edition. Berkeley, CA: New Riders.
Moed, A., Kuniavsky, M. & Goodman, E. (2012). Observing the User Experience, Second Edition. Waltham, MA: Elsevier.
Lang, J. & Howell, E. (2017). Researching UX: User Research. Collingwood, VIC, Australia: SitePoint Pty. Ltd.
Wilson, C. (2014). Interview Techniques for UX Practitioners. Waltham, MA: Elsevier.