We welcome guest author Kate Haley Goldman, a consultant who has spent over 20 years listening to visitors in her work with museums, directing research, technology, and learning projects in the US and abroad. This is the first of her two-part series on museum evaluation.
The field of evaluation tends to get a bad rap. Like the words “engagement” or even “analytics,” it can be difficult to know exactly what someone means when they speak of evaluation. According to evaluation field leader Michael Patton, evaluation is “any effort to increase human effectiveness through systematic data based inquiry” (1990). “Any effort” is a pretty broad net, and that breadth sometimes engenders misperceptions. Evaluation is not simply a focus group, usability testing, or something you do at the end of a project. Evaluation does not center on finding fault. It does not need to be time-consuming or costly.
When working with institutions that are trying to build capacity in evaluation, I emphasize that to truly make the results useful, a team needs to build an evaluative mindset. There isn’t one best way to do an evaluation, just as there isn’t only one way to craft an exhibition. Designing an evaluation requires a blend of social science, an institution’s goals and needs, logistical planning, and an understanding of visitor behavior. The design of a quality evaluation needs to balance what the evaluation needs to find out, who the audiences are, what time and resources are available, and the unique elements of each institution. Above all, evaluation needs to be useful. If an evaluation question can’t be used by the museum for decision-making, we shouldn’t be asking it.
In the museum field, evaluation is divided into distinct types, each with a different set of assumptions, questions, and methods.
1. Front-end evaluation
Front-end evaluation is done during the early stages of a project: the discovery phase, research phase, or concept design. Front-end evaluation allows the team to understand what visitors know about a topic, what questions and misconceptions they have, and what content resonates with them personally. Front-end methods can include in-person interviews, card sorts, personal meaning mapping, persona creation, online surveys, and research reviews. At the moment, I’m doing front-end narrative testing for one museum launching a new permanent exhibition and another creating a whole new building. In each institution, we’re conducting a series of narrative testing discussion groups with key constituencies to better understand visitor perception of the material. This has allowed us to discover what visitors are curious about and where we need to develop deeper content.
2. Formative evaluation
Formative evaluation occurs during design development, but before the actual launch of an exhibition or program. Often including but not limited to prototyping, formative evaluation answers questions such as: Does this content make sense to the visitor? Does it meet their needs? Methods here include paper prototyping, usability testing, remote user interviews, think-aloud protocols, and more.
3. Remedial evaluation
Remedial evaluation occurs in that sweet spot during a soft launch. The project is launched or nearly launched, and we’re working out the last kinks. Common questions include: What is working well? What needs reinforcing or support? What needs to change? Are visitors understanding? Enjoying? Methods at this stage include timing and tracking studies, focused observations, and exit interviews.
These first three types of evaluation set a project up for useful summative evaluation. Summative evaluation is about determining what impact a program, exhibition, or project had on visitors. While some individuals frame impact solely as learning, I prefer to take a broader approach, studying affective and experiential impact as well as cognitive impact. What do visitors know, think, believe, feel, or do differently because they have been to your exhibition? What are they taking home with them, whether it is a warm family memory, a new understanding of the ocean, or a deep emotional reaction to a work of art?
Audience research stands apart from the other types of evaluation; it is conducted independently of any specific project. Institution-wide, it allows decision makers to better understand who an institution’s audience is, their needs and motivations, and patterns of visitorship. Audience research is at the heart of high-quality visitor personas and the timing of exhibition schedules; it is the baseline for understanding the ebb and flow of an institution. Typically done through a combination of random-selection surveys and interviews, audience research should be designed to systematically capture a realistic set of portraits of a museum’s visitorship, including high seasons and low seasons, weekdays and weekends.
Audience research and summative evaluation differ from the prior three types in that they require a level of rigor in the research design, a balance of methods, and care in the sampling that is not as crucial for the other phases. For prototyping, you get useful information from each individual you work with. It may take only a few individuals before you know whether a feature is confusing or broken, or conversely enjoyable and engaging to use. Even the most rigorous website usability testing happens in rounds of 20–30 individuals. Audience research and summative evaluation require a wider lens, so you can ensure the patterns of visitation you find do not rest on a biased sample, which would generate findings that are not replicable. If an institution lacks audience research, or has recently changed dramatically (such as by opening a new wing), it should consider conducting audience research three to four times in a single year to gather a solid foundation of seasonal data.