Although customer acceptance of chatbots is growing, chatbots still often leave the impression of a poor user experience. The reason is that the full potential of the technology can only be realized through continuous, targeted optimization. It is therefore essential to think about how to evaluate the performance of chat or voice bots before going live. However, most companies have not defined any targets against which they can measure the success of their bots.

This blog post therefore describes established goals and KPIs from chatbot projects, because in most cases companies' intentions are similar. They usually include goals such as the internal impact and efficiency of such systems, but user acceptance and customer satisfaction also play a major role.

Internal impact and efficiency: Do virtual assistants help us to save resources and work more efficiently? 

The following KPIs can be used to determine whether a virtual assistant saves resources or makes work processes more efficient:

Number of active users

How many individual users have interacted with the chatbot? It should also be measured whether the number of customers with whom employees communicate has been reduced through the use of chatbots.

Number of conversations/sessions handled by the bot

A user can have several conversations or sessions with a virtual assistant. Analytics systems typically count a new session after a certain period of inactivity (e.g. 15 min).
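
As a rough illustration of how such a session count could be derived from raw message logs, here is a minimal Python sketch. The event format (user id plus timestamp) and the function names are assumptions made for this example; the 15-minute timeout simply mirrors the example above, and your analytics tool may split sessions differently.

```python
from datetime import datetime, timedelta

# Assumed inactivity timeout; 15 minutes mirrors the example in the text.
SESSION_TIMEOUT = timedelta(minutes=15)


def count_sessions(events, timeout=SESSION_TIMEOUT):
    """Count sessions in a list of (user_id, timestamp) message events.

    A new session starts whenever the gap between a user's message and
    that user's previous message is longer than `timeout`.
    """
    last_seen = {}  # user_id -> timestamp of that user's previous message
    sessions = 0
    for user_id, ts in sorted(events, key=lambda e: e[1]):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous > timeout:
            sessions += 1
        last_seen[user_id] = ts
    return sessions


def count_active_users(events):
    """Number of distinct users that sent at least one message."""
    return len({user_id for user_id, _ in events})


# Hypothetical sample data
events = [
    ("anna", datetime(2024, 1, 8, 9, 0)),
    ("anna", datetime(2024, 1, 8, 9, 5)),    # same session (5 min gap)
    ("anna", datetime(2024, 1, 8, 10, 0)),   # new session (55 min gap)
    ("ben", datetime(2024, 1, 8, 9, 2)),
]
print(count_sessions(events), count_active_users(events))
```

With this sample data, two active users produce three sessions, which shows why both numbers should be reported side by side.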

Human Handover Rate

How often does the chatbot transfer the conversation to an employee? How often do users explicitly ask to talk to a real person? The better the virtual assistant is trained, the fewer conversations have to be handed over to a real person.

Percentage and number of default fallback intents/error messages

This metric shows how often the chatbot was unable to provide the desired response. In these cases it typically sends answers like "I'm sorry, I didn't understand that". If the error message rate is very high (20% or more), the virtual assistant needs further training, and it is worth checking whether the required information is available to it at all.
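
A small sketch of how this rate could be computed from a log of matched intents. The intent name `default_fallback` and the list-based log format are assumptions for illustration; only the 20% threshold comes from the text above.

```python
# Threshold taken from the text: 20% or more counts as very high.
FALLBACK_ALERT_THRESHOLD = 0.20


def fallback_rate(matched_intents, fallback_name="default_fallback"):
    """Share of bot responses that were answered by the fallback intent.

    `fallback_name` is a hypothetical intent name; use whatever your
    platform calls its fallback intent.
    """
    if not matched_intents:
        return 0.0
    fallbacks = sum(1 for intent in matched_intents if intent == fallback_name)
    return fallbacks / len(matched_intents)


# Hypothetical sample log: one matched intent per bot response
intents = ["order_status", "default_fallback", "opening_hours",
           "default_fallback", "order_status"]
rate = fallback_rate(intents)
needs_review = rate >= FALLBACK_ALERT_THRESHOLD
print(f"{rate:.0%} fallback rate, review needed: {needs_review}")
```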

Number of messages processed outside business hours

Being able to offer customers good service outside business hours is a clear competitive advantage. Therefore it should be measured how many requests the chatbot can answer independently at these times.
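
To make this measurable, each answered message has to be classified as inside or outside business hours. The following sketch assumes business hours of Monday to Friday, 09:00 to 17:00, and a simple log of timestamps with a flag for whether the bot answered without help; both are assumptions for this example.

```python
from datetime import datetime, time

# Assumed business hours: Monday to Friday, 09:00 to 17:00.
OPEN, CLOSE = time(9, 0), time(17, 0)


def is_outside_business_hours(ts):
    """True on weekends, before opening, and from closing time onwards."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (OPEN <= ts.time() < CLOSE)


def count_answered_outside_hours(messages):
    """Count messages the bot answered on its own outside business hours.

    `messages` is a list of (timestamp, answered_by_bot) tuples.
    """
    return sum(
        1 for ts, answered_by_bot in messages
        if answered_by_bot and is_outside_business_hours(ts)
    )


# Hypothetical sample data
messages = [
    (datetime(2024, 1, 8, 10, 30), True),   # Monday, during business hours
    (datetime(2024, 1, 8, 21, 15), True),   # Monday evening
    (datetime(2024, 1, 13, 11, 0), True),   # Saturday
    (datetime(2024, 1, 9, 23, 0), False),   # Tuesday night, not answered
]
print(count_answered_outside_hours(messages))
```

Public holidays and time zones are deliberately left out here; a production version would need to handle both.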

Feedback on conversations outside business hours

At the same time, direct feedback from users during this period should also be specifically monitored. For example, if users write that the information provided by the chatbot was not helpful, the responses may need to be revised.

Ø confidence score in total or per conversation

Overall, how confident is the system in giving correct answers? The confidence score describes how sure the system was that its answer fits the question, for example 100% or only 20%. In general, the more training data the system has, the more confident its answers will be.
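
As a sketch, both variants of this average can be computed in one pass. The input format, a list of (conversation id, confidence) pairs with confidence between 0.0 and 1.0, is an assumption; most NLU platforms expose a comparable value per response.

```python
from collections import defaultdict
from statistics import mean


def average_confidence(responses):
    """Return (overall_average, per_conversation_averages).

    `responses` is a non-empty list of (conversation_id, confidence)
    tuples with confidence values between 0.0 and 1.0.
    """
    by_conversation = defaultdict(list)
    for conversation_id, confidence in responses:
        by_conversation[conversation_id].append(confidence)
    overall = mean(confidence for _, confidence in responses)
    per_conversation = {cid: mean(vals) for cid, vals in by_conversation.items()}
    return overall, per_conversation


# Hypothetical sample data
responses = [("c1", 1.0), ("c1", 0.5), ("c2", 0.25), ("c2", 0.25)]
overall, per_conversation = average_confidence(responses)
print(overall, per_conversation)
```

The per-conversation view is often the more useful one, because a good overall average can hide individual conversations in which the bot was mostly guessing.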

Number of agent interactions through active intervention

The sentiment score (see below) can be used to determine whether a conversation with the chatbot is going well. If this is not the case, employees can proactively take over the conversation and thus prevent an annoyed user from leaving the conversation.
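
One possible way to decide when an employee should step in is to watch the average sentiment of the most recent user messages. The threshold of -0.5, the window of two messages, and the data format below are assumptions for this sketch, not recommended values.

```python
# Assumed threshold on the -1..1 sentiment scale; tune it to your own data.
INTERVENTION_THRESHOLD = -0.5


def conversations_needing_agent(sentiments, threshold=INTERVENTION_THRESHOLD,
                                window=2):
    """Return ids of conversations whose recent average sentiment is too low.

    `sentiments` maps a conversation id to the chronological list of
    sentiment scores of its user messages. Only the last `window`
    messages are averaged.
    """
    flagged = []
    for conversation_id, scores in sentiments.items():
        recent = scores[-window:]
        if recent and sum(recent) / len(recent) < threshold:
            flagged.append(conversation_id)
    return flagged


# Hypothetical live conversations
live = {
    "c1": [0.5, 0.25, 0.5],    # going well
    "c2": [0.0, -0.5, -1.0],   # mood is dropping
}
print(conversations_needing_agent(live))
```

Counting how often such a flag actually leads to an agent taking over gives the KPI described here.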

Degree of achievement of objectives

This metric provides valuable information about the success rate of a particular action performed using the chatbot, for example whether the user has successfully completed a process such as an order, a cancellation or a change of master data. In other cases, visits to specific target pages can be measured instead.

Ø duration of the conversation

How long does the virtual assistant take to give users what they were looking for? This metric makes it possible to evaluate the average duration of interactions between the chatbot and its users. The value will vary greatly from case to case: a chatbot that handles more complex processes will need a much longer dialogue than one that gives simple FAQ answers. This KPI helps to quantify the time saved by customers and employees. If higher user acceptance is also a goal, it additionally shows how much time users spend with the virtual assistant.
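
A minimal sketch of this calculation, assuming each conversation is logged with the timestamps of its first and last message (the data format is an assumption):

```python
from datetime import datetime, timedelta


def average_duration(conversations):
    """Mean time between the first and last message of each conversation.

    `conversations` is a list of (started_at, ended_at) datetime pairs.
    """
    if not conversations:
        return timedelta(0)
    total = sum((end - start for start, end in conversations), timedelta(0))
    return total / len(conversations)


# Hypothetical sample data
conversations = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 2)),    # 2 minutes
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 6)),  # 6 minutes
]
print(average_duration(conversations))
```

Because the value depends so heavily on the use case, it is usually more informative to compute it per process (FAQ, order, cancellation) than across all conversations.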

Number of users/sessions per channel

Virtual assistants can run on multiple platforms simultaneously. For example, a chatbot can be integrated on the homepage as well as in Facebook Messenger. In addition, the content can also be transferred to voice assistants. Here it should be measured on which channel most users interact and to what extent the type of interaction differs between channels. For example, is the degree of goal achievement higher on one channel than on another?
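
The comparison between channels can be sketched as follows. The channel names and the session format, a (channel, goal reached) pair per session, are assumptions for this example.

```python
from collections import defaultdict


def goal_completion_by_channel(sessions):
    """Return {channel: (session_count, goal_completion_rate)}.

    `sessions` is a list of (channel, goal_reached) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # channel -> [sessions, goals reached]
    for channel, goal_reached in sessions:
        counts[channel][0] += 1
        counts[channel][1] += int(goal_reached)
    return {ch: (total, reached / total)
            for ch, (total, reached) in counts.items()}


# Hypothetical sample data
sessions = [
    ("website", True), ("website", False), ("website", True), ("website", True),
    ("messenger", True), ("messenger", False),
]
print(goal_completion_by_channel(sessions))
```

With very few sessions on a channel, as in this toy data, differences in the rate say little; the session count is returned alongside the rate for exactly that reason.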

Customer satisfaction: Do chatbots help our users solve their problems? 

Whether users find a chatbot or voicebot helpful depends on whether it was able to help them solve their problems. The following three KPIs allow companies to assess how helpful users find their chatbot.

Net Promoter Score (NPS)

Would users recommend the virtual assistant to others? This question is also used by default at various other touchpoints to measure customer satisfaction.
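
For reference, the usual NPS convention asks for a rating from 0 to 10, counts ratings of 9 and 10 as promoters and ratings of 0 to 6 as detractors, and subtracts the percentage of detractors from the percentage of promoters. The sketch below follows that convention; the function name and sample ratings are made up for illustration.

```python
def net_promoter_score(ratings):
    """NPS from 0-10 ratings: percent promoters minus percent detractors.

    Promoters rate 9 or 10, detractors 0 to 6; ratings of 7 and 8
    (passives) only count towards the total.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey answers
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(net_promoter_score(ratings))
```

The result ranges from -100 (only detractors) to 100 (only promoters).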

Ø sentiment score in total or per conversation

This score ranges from -1 (dissatisfied users) to 1 (very satisfied users) and uses machine learning and text analysis to provide information about the users' mood. For example, if users insult the chatbot, the system assigns a negative sentiment score. Human agents can then intervene and de-escalate the conversation.

Exit Feedback Surveys

Here, the overall effectiveness of the bot is evaluated from the perspective of the user experience. Users are asked to rate their experience with the chatbot, which provides a company with valuable information on the quality of the virtual assistant. This can be done with questions such as "Was this answer helpful?" or "Could I solve your problem? Yes or No" as soon as a certain point in the conversation is reached.

User acceptance: Do our users want to be helped by virtual assistants? 

Virtual assistants offer users many advantages. However, not all users want to communicate with a chatbot. How well it is accepted can be determined with the help of the following KPIs:

Insult rate

This shows how often users insult the chatbot. To measure it, a dedicated intent with training phrases covering different example insults must be created. If this intent is triggered disproportionately often, the cause should be investigated.

Ø number of messages per conversation

This indicator can be used to determine how many questions are asked before the chatbot outputs the requested information. It also describes how much time users have spent writing. It should be noted, however, that the ideal number of necessary questions varies considerably depending on the use case.

Initiation of the human handover within a certain number of user messages

If users request to speak to a real employee immediately, this can mean that acceptance is very low because they do not want to interact with the bot.
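
A sketch of how this could be measured, assuming each conversation is available as the chronological list of intents matched for the user's messages. The intent name `human_handover` and the cut-off of two messages are hypothetical choices for this example.

```python
def early_handover_rate(conversations, max_user_messages=2,
                        handover_intent="human_handover"):
    """Share of conversations with a handover request in the first messages.

    `conversations` is a list of lists of intent names, one per user
    message, in chronological order. `handover_intent` is a hypothetical
    intent name.
    """
    if not conversations:
        return 0.0
    early = sum(
        1 for intents in conversations
        if handover_intent in intents[:max_user_messages]
    )
    return early / len(conversations)


# Hypothetical sample data
conversations = [
    ["human_handover"],                                  # asks right away
    ["order_status", "order_status", "human_handover"],  # asks only later
    ["opening_hours"],                                   # never asks
    ["greeting", "human_handover"],                      # asks in message 2
]
print(early_handover_rate(conversations))
```

Separating early requests from later ones matters: a late handover tends to point to a content gap, while an immediate one points more towards low acceptance.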

Exit Rates

Exit rates show at which point in the conversation path users have left the conversation and reveal the areas where the bot loses their attention.
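
One way to compute exit rates per dialog step is to divide the number of conversations that ended at a step by the number of times that step was reached. The step names and both input structures are assumptions made for this sketch.

```python
from collections import Counter


def exit_rates(last_steps, visits_per_step):
    """Share of visits to each dialog step after which the user left.

    `last_steps` lists the final step of every conversation;
    `visits_per_step` maps each step to how often it was reached.
    Steps that were never reached are skipped.
    """
    exits = Counter(last_steps)
    return {step: exits[step] / visits
            for step, visits in visits_per_step.items() if visits}


# Hypothetical dialog with three steps
visits = {"greeting": 10, "collect_order_id": 8, "confirm": 4}
last = ["greeting"] * 2 + ["collect_order_id"] * 4 + ["confirm"] * 4
print(exit_rates(last, visits))
```

In this toy data half of all users drop out when asked for their order id, which would be the first place to look. The final step naturally has an exit rate of 100% and should be read as completion rather than abandonment.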

Ø length of messages

The shorter the queries are, the more likely it is that users are prepared for a conversation with a virtual assistant. Overly long messages that mix several topics are more difficult for the system to understand and process.

Retention rate

How often do users return to the chatbot within a given period of time? This KPI indicates how successful interactions with the virtual assistant have been. If methods such as notifications in messaging platforms or ads are used to get users to interact with the chatbot again, these returns should be evaluated separately: what should be tracked is the number of users who come back on their own. If the chatbot offers real added value, the likelihood of that is all the higher.
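
As a sketch, retention can be defined as the share of users who come back within a fixed window after their first visit. The 30-day window and the data format are assumptions; visits triggered by notifications or ads would have to be filtered out beforehand if only returns on the users' own initiative are of interest.

```python
from datetime import date, timedelta


def retention_rate(visits, period_days=30):
    """Share of users who came back within `period_days` of their first visit.

    `visits` maps a user id to a non-empty list of dates on which the
    user talked to the bot. Several visits on the first day do not
    count as a return.
    """
    if not visits:
        return 0.0
    window = timedelta(days=period_days)
    returned = 0
    for dates in visits.values():
        first = min(dates)
        if any(first < d <= first + window for d in dates):
            returned += 1
    return returned / len(visits)


# Hypothetical sample data
visits = {
    "anna": [date(2024, 1, 2), date(2024, 1, 10)],                   # returned
    "ben": [date(2024, 1, 3)],                                       # one visit
    "cara": [date(2024, 1, 5), date(2024, 3, 20)],                   # too late
    "dan": [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 20)],  # returned
}
print(retention_rate(visits))
```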

If a chatbot is not well received by users, there can be various reasons. For example, the target group may not yet be familiar with virtual assistants, or the chatbot may have been poorly trained, making it tedious and frustrating to use. A lack of user acceptance does not necessarily mean that using a chatbot makes no sense.

There are countless metrics and KPIs that ultimately provide information about the success of chat and voice bots. It is important to be clear right from the start about which problem is to be solved and which goals are to be achieved. Only on this basis can the right metrics be selected, because the recipe for success lies in optimization. Really good chatbots are created when those responsible invest resources in the maintenance and operation of the virtual assistant; a one-off development is not enough. The good news is that training a chatbot is only very costly at the beginning. Over time, and once a certain training level has been reached, the effort decreases considerably.

This and much more is covered in more detail in our white paper "3 reasons why chatbots fail".