
Building an evaluation

We welcome guest author Kate Haley Goldman for a two-part series on museum evaluation. Kate is a consultant who has spent over 20 years listening to visitors in her work with museums, directing research, technology, and learning projects in the US and abroad.

Crafting an evaluation is a little bit science, a little bit art. Doing research design is one of my favorite parts of a project, and new evaluators are often startled to realize it is a creative endeavor in itself. Often I am contacted by folks who have started off thinking about methods. I’ll get a call looking for a focus group, or a survey. At that point, we need to step back for a higher-level view. What stage is this project at? Are we getting a baseline of visitor behavior (audience research), is there a product about to roll out (formative or remedial), or is the project already live and the funder wants an update (summative)? Knowing this will help narrow down the questions. Do we need to know whether the interactives are usable? Or are we trying to gather visitor numbers to decide on a new ticketing system?

The second step is to determine the evaluation questions. These are *not* the questions we’ll ask the visitor, but the questions the evaluation will answer. A well-done evaluation answers specific questions directly, and you’ll find the evidence for those answers in the evaluation report. To differentiate the two, I’ll call these overarching questions the central research questions. Creating these questions is the most challenging part of the entire evaluation: harder than recruiting participants, harder than data collection, harder than analysis. Once you have these questions, everything else about the evaluation will be easier.

Let’s look at how to construct the questions at the heart of your research design. Here are some rules of thumb:

They must be phrased as questions. This seems trivial, but it’s not. Turning an issue into a question makes it concrete and answerable.

The central research questions should be clear and jargon free. Don’t use words or numbers that have more than one meaning in your institution. Be specific about what engagement or experience or underserved means to you and your institution.

There should be no more than five central research questions. With too many questions, the study loses focus and answers each of them poorly.

Institutions must know what they’ll do with the answer to those questions. What will you change in your exhibits or programs in answer to these questions? What decisions will be made?

For example, a poor central research question might be:

“Measure how visitors engage with the mobile app.”

A better central research question is:

“To what extent does using the mobile app change visitor experience in terms of stay-time, enjoyment, and content retention?”

We would then ask the museum what they will do with this information. Do they intend to make a stronger push to have all visitors use the app? To have visitors download the app before they arrive? To inform the next iteration of the app? To show the funder the impact of the app?

Let’s assume this museum plans to push more visitors to use the app if the results from this evaluation are encouraging. Taking the research question above, we can see how clarifying the central question leads us naturally to more appropriate methodologies. First off, we can see that the museum wants to know the difference between the experience of mobile app users and the experience of those who don’t use the app. So we need to get data from both types of visitors. Alternatively, if the museum had already completed some audience research on stay time, enjoyment, or content retention, we could conduct a study that compares against the prior data.

There are several methodological choices we could make at this juncture. The methods we choose will be based on each museum’s profile and resources. For instance, if the current pickup rate is very low, we might suggest a cued methodology, in which visitors are recruited and asked to use the app. This is the case in most mobile phone evaluations: when the pickup rate hovers under 5%, it can be almost impossible to find enough people using the app to get worthwhile data. If we’re doing a cued study, we might next look at recruitment. Does the museum have paid admission? If admission is free, does everyone stop at the information desk? This will help us decide how and where to do recruiting.
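To see why a low pickup rate pushes us toward a cued study, a rough back-of-the-envelope calculation can help. The Python sketch below estimates how many visitors would need to pass through, on average, before an uncued study encounters a target number of app users. The function name, the target of 50 app users, and the rates are illustrative assumptions, not figures from any particular museum, and the calculation assumes each visitor picks up the app independently at the same rate.

```python
import math

def visitors_needed(target_app_users: int, pickup_rate: float) -> int:
    """Rough expected number of visitors to observe before seeing
    `target_app_users` people using the app on their own.

    Assumes (hypothetically) that every visitor uses the app
    independently with probability `pickup_rate`.
    """
    if not 0 < pickup_rate <= 1:
        raise ValueError("pickup_rate must be greater than 0 and at most 1")
    # Round first so floating-point noise does not push the result up by one.
    return math.ceil(round(target_app_users / pickup_rate, 6))

# Hypothetical scenario: we want data from 50 app users.
for rate in (0.05, 0.02):
    print(f"pickup rate {rate:.0%}: about {visitors_needed(50, rate)} visitors")
```

Under these assumptions, even a 5% pickup rate means watching roughly a thousand visitors to find 50 app users, which is often more than an evaluation's staffing can support.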

We might also ask how much staff time or technology we’re able to devote to the evaluation effort. Measuring stay time accurately requires the entry and the exit time to be noted; visitors are quite poor at assessing how long they spend in an area. If the institution were using Dexibit to track flow, not only would it have a baseline stay time to measure against, but we could also construct a study that measures the stay time of the visitors using the mobile technology.
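As a small illustration of the stay-time comparison, here is a minimal Python sketch that turns noted entry and exit times into stay times and compares the median for app users against everyone else. The records, field layout, and function names are all hypothetical; real data might come from an observer's timing sheet or an export from a tracking system, and a real study would need far more visitors than this.

```python
from datetime import datetime
from statistics import median

# Hypothetical observation records: (used_app, entry_time, exit_time)
observations = [
    (True, "10:02", "10:31"),
    (True, "10:15", "10:52"),
    (False, "10:05", "10:20"),
    (False, "10:40", "11:01"),
    (False, "11:10", "11:22"),
]

def stay_minutes(entry: str, exit_: str) -> float:
    """Stay time in minutes from same-day HH:MM entry and exit times."""
    fmt = "%H:%M"
    delta = datetime.strptime(exit_, fmt) - datetime.strptime(entry, fmt)
    return delta.total_seconds() / 60

def median_stay_by_group(records):
    """Median stay time in minutes for app users and non-users."""
    groups = {True: [], False: []}
    for used_app, entry, exit_ in records:
        groups[used_app].append(stay_minutes(entry, exit_))
    labels = {True: "app", False: "no app"}
    # Skip a group with no observations, since median() needs data.
    return {labels[k]: median(v) for k, v in groups.items() if v}

print(median_stay_by_group(observations))
```

The median is used here rather than the mean because a few very long or very short visits can skew an average; that is a design choice for this sketch, not a rule.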

The next issue to address is visitor enjoyment of the app. We could embed a question on enjoyment within the app. Or we could track the visitors and note when they appear to be enjoying the content. Or we could conduct an exit interview when they finish and ask what they enjoyed about it.

Finally, we need to consider the resources available. Do we have the staff and the time to complete a timing and tracking study? Or is this effort better accomplished in a different way? After working through the questions that need to be answered, and then the context and needs of the museum, we can more accurately and appropriately determine which method to employ. The next time you are part of a planning session for an evaluation, visitor feedback, or usability testing, ask yourself and your team what your evaluation questions are, and only then start to plan your methods.