Approach
Participants were introduced to the study, provided informed consent, and presented with a smartphone running the SideWaze prototype application.
A scenario was given to the participant to provide context. Next, the participant was asked to perform specific tasks, thinking aloud while attempting to complete them. If the participant encountered a part of the prototype that was incomplete or non-functional, or if they became insurmountably stuck, the test administrator stepped in to provide guidance.
The participants were not timed and were informed that the usability test was not a race.
During the usability test, notes were taken by the administrator if any of the following were observed:
- positive or negative comments about the flow or app design (“Ooh, I like that…”, “I don’t like…”);
- unexpected or unusual questions that hinted at a deficiency in the interface or preferred workflow;
- the participant performed an action that the prototype did not handle, beyond its known limitations.
The goals of testing this workflow were to identify where users encountered problems or confusion and where they experienced delight or validation; to assess whether the prototype represented a reasonable solution for them and their lives; and to gather additional feedback about the challenge and how it might be addressed (i.e., using the prototype and usability tests as conversation starters).
Given the current state of the prototype, the well-defined workflows, and the expectation that, even within the primary stakeholder group, there would be a diversity of impressions, a usability test was used to evaluate the prototype. This approach also meant that very little training was required for the participant, that the testing could be done in a variety of locations, and that the time commitment for each test was fairly low (approximately 30 minutes per participant session).
The current state of the prototype and its limited functionality outside of the scripted path was such that capturing task completion time was unrealistic.
Instead, for the first workflow (entering sidewalk condition data), the System Usability Scale (SUS) allowed for a well-rounded assessment of the prototype's usability for this workflow. The SUS is approachable for users because it uses simple language and non-numerical response options.
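For reference, the standard SUS scoring procedure converts the ten five-point Likert responses into a single 0–100 score: odd-numbered (positively worded) items contribute the response minus one, even-numbered (negatively worded) items contribute five minus the response, and the sum is multiplied by 2.5. A minimal sketch (the function name is illustrative, not part of the study materials):

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100).

    `responses` is a list of ten Likert ratings (1-5), in
    questionnaire order. Odd-numbered items are positively
    worded; even-numbered items are negatively worded.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten ratings between 1 and 5")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd items: rating - 1; even items: 5 - rating.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# A respondent who strongly agrees with every positive item (5)
# and strongly disagrees with every negative item (1) scores 100.
print(sus_score([5, 1] * 5))  # 100.0
```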
For the second workflow (learning about sidewalk conditions), the feedback grid was chosen for its conversational tone and flexibility. It allowed the facilitator to gather feedback on four important aspects of system usability: what was liked, what could be improved, questions that were generated, and new ideas that could be evaluated. It also allowed participants to say what they really thought of the system, free from the confines of a Likert scale.
Participants
Six participants (n = 6) were engaged to test the workflow around submitting sidewalk condition data. Four participants were female and two were male. Participants ranged in age from 37 to 76, with a mean age of 47.17 years (s = 14.36).
Participants were chosen from the researcher’s network. Attention was paid to having representatives from several elements identified in the primary stakeholder group. In particular:
- one participant was an older adult whose spouse is legally blind (and who often uses sidewalks and public transit with a cane);
- two participants were parents of young children and frequent sidewalk users; and
- all participants engaged in active transportation (walking on sidewalks or cycling, etc.) although this was not an experimental variable.
Tasks
Tasks were provided to the user one at a time after presentation of a scenario. Tasks exercised elements of the respective workflows.
Evaluation
For the sidewalk condition reporting workflow, evaluation was done using the SUS instrument. Observations and comments were clustered using an affinity mapping exercise.
For the sidewalk condition sharing workflow, evaluation was done using a feedback grid. Observations, comments, and feedback were clustered using an affinity mapping exercise.
Results
As provided by study participants (Table 1), the prototype SUS scores had a mean of 88.3, with a range of 77.5 to 95.0 (s = 6.055). By comparison, an SUS score of 85 corresponds to an adjective rating of ‘excellent’ or a grade of B (Brooke, 2013). Three of the six participants gave the usability of the workflow an A grade.
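The summary statistics above can be reproduced from per-participant scores with the standard sample mean and sample standard deviation. As a sketch, using hypothetical scores chosen only to be consistent with the reported mean, range, and standard deviation (the actual per-participant values are in Table 1):

```python
import statistics

# Hypothetical SUS scores for illustration only; the study's
# actual per-participant scores appear in Table 1.
scores = [77.5, 87.5, 87.5, 90.0, 92.5, 95.0]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
print(round(mean, 1), round(sd, 3))  # 88.3 6.055
```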

Table 2 aggregates all feedback provided by participants for each of the four categories.

Conclusion
The usability test data suggested that:
- Participants felt the prototype app, as a system, was very easy to use, reflected in a high average SUS rating.
- The prototype app should acknowledge when photos and videos are uploaded. All users identified this, either explicitly or through visible confusion, as a significant problem.
- Presenting sidewalk conditions visually and spatially on a map resonated with users and allowed them to apply learned knowledge and patterns from other systems.
- The use of flags to communicate sidewalk condition was not universally understood. Given that they form a critical part of the information visualization, they should not be considered successful in their current form.
- The reporting workflow may need to be simplified, as it seemed cumbersome to some users. This is especially relevant given the need for high-volume data entry by users.
Design implications
Considering the data gathered through usability tests, the following design changes should be considered:
- Several options exist for the display of sidewalk condition information on a map (flags, segments, different colours/icons, etc.). Multiple, incremental designs should be explored, prototyped, and tested to converge more quickly on a design that is more universally understood.
- All user actions should be confirmed in some way; this was a significant shortcoming of the prototype as tested. Platform guidelines (e.g., iOS guidelines, Android Material Design guidelines) should be followed or, in the absence of any guidance, comparable products should be examined to utilize learned knowledge (e.g., What does Google Maps do?).
- The current prototype is not designed for high-volume reporting. Future iterations should explore either forking workflows (e.g., beginner vs. advanced) or, if research supports it, separate systems. For example, if postal workers were recruited to submit sidewalk condition information, would they need a dedicated workflow (e.g., a speech interface, tactile interaction with a smartwatch, etc.)?
- A research question for this work was to determine whether crowdsourcing could help with the collection and sharing of sidewalk condition data. After using the prototype app, users raised concerns about the coverage and timeliness of the data, as well as the time required to submit a single report. Users felt that unless coverage was timely and near universal, they might not be confident relying on the app exclusively.
- As a general rule, the size of interface widgets should be increased before further prototype testing. Users were observed struggling to hit some smaller widgets (e.g., check boxes, the ‘Next’ button).
- Integration with other parts of the sidewalk ecosystem is key. This was hinted at by some users (‘who would update the status of an issue?’) but, from an architectural viewpoint, it is critical for such a solution to be useful in a variety of situations and by different users, scalable across differently sized installations, and replicable across a country with its variety of municipalities.
Considered in isolation, the SideWaze prototype app fared well during usability testing (with the exceptions noted above). Revisiting the goals of this research puts the app into a different context, however. Should this work continue, a change in focus would be prudent: away from the end user technology to the systems, needs, and data flow of the supporting organizations (call centres, city departments, etc.).