
It could be the most important question conversational AI teams ask: is the assistant really helping users?

Teams typically look at two types of metrics to answer that question: direct measures and proxy measures. Direct measures are what we usually think of as high-level stats about the assistant and its performance. For example, teams can track the number of users and conversations the assistant handled, the session length, or the average number of messages sent in a conversation. Digging one level deeper, you could also look at the frequency of intents to understand which requests or actions are the most common.
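As a rough illustration, the direct measures listed above can be computed from a message log. This is a minimal sketch: the log format (one record per user message with `conversation_id`, `user_id`, `intent`, and `timestamp` fields) and the function name `direct_metrics` are assumptions made for the example, not the export format of any particular tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical message log: one record per user message (assumed schema).
messages = [
    {"conversation_id": "c1", "user_id": "u1", "intent": "greet", "timestamp": datetime(2024, 1, 1, 9, 0)},
    {"conversation_id": "c1", "user_id": "u1", "intent": "check_balance", "timestamp": datetime(2024, 1, 1, 9, 1)},
    {"conversation_id": "c1", "user_id": "u1", "intent": "transfer_money", "timestamp": datetime(2024, 1, 1, 9, 4)},
    {"conversation_id": "c2", "user_id": "u2", "intent": "greet", "timestamp": datetime(2024, 1, 1, 10, 0)},
    {"conversation_id": "c2", "user_id": "u2", "intent": "check_balance", "timestamp": datetime(2024, 1, 1, 10, 2)},
    {"conversation_id": "c3", "user_id": "u1", "intent": "check_balance", "timestamp": datetime(2024, 1, 1, 11, 0)},
]


def direct_metrics(messages):
    """Compute top-level stats: users, conversations, session length, intent frequency."""
    # Group messages by conversation so per-session stats can be derived.
    by_conversation = {}
    for m in messages:
        by_conversation.setdefault(m["conversation_id"], []).append(m)

    n_conversations = len(by_conversation)

    # Session length here means seconds between the first and last message.
    session_seconds = [
        (max(m["timestamp"] for m in msgs) - min(m["timestamp"] for m in msgs)).total_seconds()
        for msgs in by_conversation.values()
    ]

    return {
        "users": len({m["user_id"] for m in messages}),
        "conversations": n_conversations,
        "avg_messages_per_conversation": len(messages) / n_conversations if n_conversations else 0.0,
        "avg_session_seconds": sum(session_seconds) / n_conversations if n_conversations else 0.0,
        "intent_counts": Counter(m["intent"] for m in messages),
    }


metrics = direct_metrics(messages)
print(metrics)
```

Defining session length as the gap between the first and last message is only one possible choice; a team might instead count time until the conversation is closed or times out.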

Proxy measures of whether an assistant helped a user are just as important, if not more so. User behaviors that indicate the interaction went well are often good proxy measures. For example, you might track whether a customer contacts support a second time within 24 hours, clicks a link or CTA, or gives a positive NPS rating in an exit survey. These actions don’t necessarily happen within a session with an assistant, but they can tell you a lot about whether the assistant solved the user’s problem.
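One of the proxy measures above, a second support contact within 24 hours, can be sketched as a join between assistant sessions and support-desk records. The data shapes (lists of `(user_id, timestamp)` tuples) and the function name `repeat_contact_rate` are hypothetical; in practice the support contacts would come from an external system such as a ticketing tool.

```python
from datetime import datetime, timedelta

# Hypothetical data: when each assistant session ended, per user.
sessions = [
    ("u1", datetime(2024, 1, 1, 9, 0)),
    ("u2", datetime(2024, 1, 1, 10, 0)),
    ("u3", datetime(2024, 1, 1, 11, 0)),
    ("u4", datetime(2024, 1, 1, 12, 0)),
]

# Hypothetical data: support contacts recorded outside the assistant.
support_contacts = [
    ("u1", datetime(2024, 1, 1, 15, 0)),   # 6 hours after the session
    ("u2", datetime(2024, 1, 3, 10, 0)),   # 48 hours after, outside the window
    ("u4", datetime(2024, 1, 2, 12, 0)),   # exactly 24 hours after, counted
]


def repeat_contact_rate(sessions, support_contacts, window=timedelta(hours=24)):
    """Share of assistant sessions followed by a support contact within `window`."""
    if not sessions:
        return 0.0
    repeats = 0
    for user_id, ended_at in sessions:
        # A repeat contact is any contact by the same user after the session
        # ended and no later than the end of the window.
        if any(
            uid == user_id and ended_at < contacted_at <= ended_at + window
            for uid, contacted_at in support_contacts
        ):
            repeats += 1
    return repeats / len(sessions)


rate = repeat_contact_rate(sessions, support_contacts)
print(f"Repeat contact rate: {rate:.0%}")
```

A lower rate suggests the assistant resolved the issue the first time. The 24-hour window is taken from the example in the text and is exposed as a parameter so a team can adjust it.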

Conversation-driven development emphasizes that effectively tracking success rates means looking beyond direct measures and top-level statistics. Proxy measures and insights learned from reviewing conversations provide a more complete picture of how well the assistant is meeting user needs. Too often, teams focus exclusively on top-level metrics and miss the understanding that can be gained from indirect metrics.

Use this play to align your team on the metrics you should be paying attention to. Repeat this play at 3, 6, and 12 months to reassess the metrics you track.


Play 5: Identify the Metrics that Matter

Materials: Post-it notes and markers for each member of the group; whiteboard or large sheet of paper

Time: 1-2 hours

Participants: 5-8 team members

Step 1: Brainstorm questions

Ask participants to write down questions about the assistant’s success rate on individual Post-it notes. Examples could include whether a user would choose getting help from the assistant over other methods, the number of contacts deflected, or how many users complete their goal within a session. Set a timer for 5 minutes while team members brainstorm questions individually.

Step 2: Rank the important questions

Collect all of the Post-it notes. The facilitator reads each note and places it on the whiteboard. Similar questions should be grouped together to track duplicates. Sort the questions according to which have the most duplicates—these can be considered “votes.” Rank the top 5 questions according to the number of duplicate “votes” and group discussion. A question with few duplicates may be promoted to a higher rank if the group agrees it has merit.

Step 3: Establish measurable answers

Move all question Post-its except the top 5 to an out-of-the-way area of the board, designated the “parking lot.” Ask participants to brainstorm measurable ways to answer each of the top 5 questions and write down each idea on its own Post-it. Set a timer for 5 minutes while the team brainstorms.

Step 4: Balance direct metrics with proxies

On your whiteboard or sheet of paper, divide the space into two halves: Direct Metrics (measured inside the assistant) and Proxy Metrics (measured outside the assistant). The facilitator collects all of the Post-its from Step 3 and reads the ideas out loud, then sorts each idea into the appropriate section of the board, depending on whether it’s a direct metric or a proxy.

When all ideas have been sorted, consider the balance of proxy to direct metrics. Are ideas skewed toward one area of the board or another? Discuss the outcome as a group and make sure the session has generated ideas for both direct and proxy metrics.

Step 5: Summarize your findings

Document the session by taking pictures of the board and summarizing the brainstorming session in a place where it can be accessed across the organization, like a company wiki. Identify next steps for instrumenting your assistant to track the metrics you selected.

Discussion Questions

  1. Is there agreement or disagreement about which metrics are most important across different roles?
  2. For the metrics you’ve identified, do you currently have a good baseline for comparison?
  3. What is the process for reviewing reports and translating insights into actionable tasks?
  4. Which external systems and integrations might contain clues about how well the assistant is performing?
  5. How do user goals and journeys contribute to selecting success metrics?
  6. How do business goals contribute to selecting success metrics?
