Although customer acceptance of chatbots is growing, they still often offer a poor user experience. The reason is that the full potential of the technology can only be realised through continuous and targeted optimisation. It is therefore essential to think about how the performance of chatbots and voicebots will be evaluated before they go live. Yet most companies have not defined any goals against which they can measure the success of their bots.

This blog post therefore describes established goals and KPIs from chatbot projects, because in most cases the intentions of the companies are similar. As a rule, they include goals such as the internal impact and efficiency of such systems, but user acceptance and customer satisfaction also play a major role.

Internal impact and efficiency: Do virtual assistants help us to save resources and work more efficiently? 

The following KPIs can be used to determine whether the use of a virtual assistant can save resources or make work processes more efficient:

Number of active users 

How many individual users have interacted with the chatbot? It should also be measured whether the number of customers with whom employees communicate has been reduced through the use of chatbots.

Number of conversations/sessions handled by the bot 

A user can have several conversations or sessions with a virtual assistant. Analytics systems typically count a new session after a certain period of inactivity (e.g. 15 min).
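
The inactivity rule described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular analytics tool works: the function name `count_sessions`, the sample timestamps and the 15-minute default are assumptions chosen to match the example in the text.

```python
from datetime import datetime, timedelta


def count_sessions(timestamps, inactivity=timedelta(minutes=15)):
    """Count the sessions of one user.

    A new session starts whenever the gap between two consecutive
    messages is longer than `inactivity` (assumed: 15 minutes).
    """
    ordered = sorted(timestamps)
    if not ordered:
        return 0
    sessions = 1
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > inactivity:
            sessions += 1
    return sessions


# Hypothetical message timestamps of a single user.
messages = [
    datetime(2024, 1, 8, 9, 0),
    datetime(2024, 1, 8, 9, 5),    # 5 min gap: same session
    datetime(2024, 1, 8, 9, 40),   # 35 min gap: new session
    datetime(2024, 1, 8, 14, 0),   # several hours later: new session
]
session_count = count_sessions(messages)
```

With these sample timestamps the user is counted with three sessions. The inactivity window is a parameter because the right value depends on the channel and the use case.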

Human Handover Rate

How often does the chatbot hand over the conversation to a staff member? How often do users explicitly ask for a real person to talk to? The better the virtual assistant is trained, the fewer conversations need to be handed over to a real human.

Percentage and number of default fallback intents/error messages

This metric shows how often the chatbot was not able to give the desired answer. In these cases, answers such as "I'm sorry, I didn't understand that" are often sent. If the error message rate is very high (20% or more), the virtual assistant needs further training, and it should be checked whether the required information is available at all.
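
As a rough sketch, the fallback rate can be computed from a log of matched intents. The intent name `default_fallback`, the sample log and the function name are assumptions for illustration; only the 20% threshold is taken from the text above.

```python
def fallback_rate(matched_intents, fallback_intent="default_fallback"):
    """Share of bot turns that were answered with the fallback intent."""
    if not matched_intents:
        return 0.0
    fallbacks = sum(1 for intent in matched_intents if intent == fallback_intent)
    return fallbacks / len(matched_intents)


# Hypothetical log: one matched intent per user message.
matched_intents = [
    "order_status",
    "default_fallback",
    "opening_hours",
    "default_fallback",
    "cancel_order",
]
rate = fallback_rate(matched_intents)

# Threshold of 20% taken from the rule of thumb in the text.
needs_review = rate >= 0.20
```

In this sample, two of five messages end in the fallback, so the rate is 40% and the bot would be flagged for retraining.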

Number of messages processed outside business hours 

Being able to offer customers good service outside business hours is a clear competitive advantage. It should therefore be measured how many requests the chatbot can answer independently at these times.
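
One possible way to count such requests is sketched below. The business hours (Monday to Friday, 9:00 to 17:00), the sample timestamps and the function names are assumptions and would need to be adapted to the actual service times.

```python
from datetime import datetime, time

# Assumed business hours: Monday to Friday, 9:00 to 17:00.
OPENING_TIME = time(9, 0)
CLOSING_TIME = time(17, 0)


def is_outside_business_hours(timestamp):
    """Return True for weekends and for times before opening or after closing."""
    if timestamp.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (OPENING_TIME <= timestamp.time() < CLOSING_TIME)


def count_outside_business_hours(timestamps):
    """Number of requests that arrived outside the assumed business hours."""
    return sum(1 for ts in timestamps if is_outside_business_hours(ts))


# Hypothetical request timestamps.
requests = [
    datetime(2024, 1, 8, 10, 30),   # Monday morning, within business hours
    datetime(2024, 1, 8, 21, 15),   # Monday evening
    datetime(2024, 1, 13, 11, 0),   # Saturday
]
outside_count = count_outside_business_hours(requests)
```

Here two of the three requests fall outside business hours. To measure how many of them the bot answered independently, the same filter could be combined with the handover and fallback information of each conversation.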

Feedback on requests outside business hours

At the same time, direct feedback from users during this period should also be specifically monitored. For example, if users write that the information provided by the chatbot was not helpful, the responses may need to be revised.

Average confidence score overall or per conversation

Overall, how confident is the system in giving correct answers? The confidence score describes how sure the system was that the answer matched the question, for example 100% or only 20%. As a rule, the more training data the system has, the more confident the answers will be.
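
The two views of this metric, overall and per conversation, can be sketched as follows. The record layout (conversation ID plus a confidence value between 0 and 1) and all names are assumptions; real NLU platforms expose the confidence in their own formats.

```python
from collections import defaultdict
from statistics import mean


def average_confidence(records):
    """Return (overall average, average per conversation).

    `records` is a list of (conversation_id, confidence) pairs,
    with confidence assumed to be a value between 0 and 1.
    """
    per_conversation = defaultdict(list)
    for conversation_id, confidence in records:
        per_conversation[conversation_id].append(confidence)
    overall = mean(confidence for _, confidence in records)
    by_conversation = {cid: mean(values) for cid, values in per_conversation.items()}
    return overall, by_conversation


# Hypothetical confidence values for two conversations.
records = [("a", 1.0), ("a", 0.5), ("b", 0.25), ("b", 0.25)]
overall, by_conversation = average_confidence(records)
```

In this sample the overall average is 0.5, while conversation "b" averages only 0.25, which makes it a candidate for a closer look at the training data.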

Number of agent interactions through active intervention

The sentiment score (see below) can be used to determine whether a conversation with the chatbot is on track. If this is not the case, employees can proactively take over the conversation and thus prevent an angry user from leaving.

Degree of target fulfilment 

This metric provides valuable information on the success rate of a specific action performed with the help of the chatbot. For example, it shows whether the user has successfully completed a certain process such as an order, a cancellation or a change of master data. Alternatively, visits to specific landing pages can also be measured.

Average duration of the conversation

How long does it take the assistant to give users what they were looking for? This metric makes it possible to evaluate the average duration of the interactions between the chatbot and its users. The value will vary greatly from case to case: a chatbot that handles more complex processes will need a much longer dialogue than one that gives simple FAQ answers. This KPI helps quantify the time saved by customers and employees. In addition, if the goal is to increase user acceptance, it shows how much time users spend with the virtual assistant.

Number of users/sessions per channel

Virtual assistants can run on several platforms at the same time. For example, a chatbot can be integrated on the homepage as well as on the Facebook Messenger channel. In addition, the content can also be transferred to voice assistants. Here, it should be measured on which channel most users interact and to what extent the type of interaction differs. For example, is the degree of target fulfilment higher on one channel than on another?

Customer satisfaction: Do chatbots help our users solve their problems? 

Whether users find a chatbot or voicebot helpful depends on whether it was able to help them solve their problems. The following three KPIs allow companies to assess how helpful users find their chatbot.

Net Promoter Score (NPS) 

Would users recommend the virtual assistant to others? This question is also used by default at various other touchpoints to measure customer satisfaction.
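
For reference, the NPS is conventionally calculated from answers on a scale from 0 to 10: respondents answering 9 or 10 count as promoters, those answering 0 to 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch, with hypothetical sample ratings and an assumed function name:

```python
def net_promoter_score(ratings):
    """Compute the NPS from ratings on the usual 0 to 10 scale.

    Promoters: 9 or 10. Detractors: 0 to 6. Passives (7 or 8) only
    count towards the total number of responses.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for rating in ratings if rating >= 9)
    detractors = sum(1 for rating in ratings if rating <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey answers collected at the end of bot conversations.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
nps = net_promoter_score(ratings)
```

The sample contains five promoters and three detractors among ten answers, which gives an NPS of 20. The score can range from -100 to 100.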

Average sentiment score overall or per conversation

This score ranges from -1 (dissatisfied users) to 1 (very satisfied users) and provides information about the mood of the users with the help of machine learning and text analysis. For example, if insults are used against the chatbot, the system reacts with a negative sentiment score. Human agents can intervene accordingly and de-escalate the conversation.
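
How such a score could feed the intervention by human agents is sketched below. The sentiment values themselves would come from a text analysis service; here they are given as sample data. The threshold of -0.5 and all names are assumptions for illustration.

```python
from statistics import mean

# Assumed threshold: conversations averaging below this value are escalated.
INTERVENTION_THRESHOLD = -0.5


def conversations_needing_agent(scores_by_conversation, threshold=INTERVENTION_THRESHOLD):
    """Return the IDs of conversations whose average sentiment is below the threshold.

    `scores_by_conversation` maps a conversation ID to the sentiment
    scores (-1 to 1) of its user messages.
    """
    return sorted(
        conversation_id
        for conversation_id, scores in scores_by_conversation.items()
        if scores and mean(scores) < threshold
    )


# Hypothetical sentiment scores per user message.
scores = {
    "c1": [0.5, 0.25],     # satisfied user
    "c2": [-0.75, -0.5],   # clearly negative
    "c3": [-0.25, 0.0],    # slightly negative, but above the threshold
}
flagged = conversations_needing_agent(scores)
```

Only conversation "c2" is flagged in this sample. A stricter or looser threshold changes how often agents are asked to step in, so the value is worth tuning against real conversations.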

Exit feedback surveys 

Here, the overall effectiveness of the bot is evaluated from a user experience perspective. Users are asked to rate their experience with the chatbot, for example with questions such as "Was this answer helpful?" or "Could I solve your problem? Yes or No" once a certain point of the conversation has been reached. This gives a company valuable information about the quality of the virtual assistant.

User acceptance: Do our users want to be helped by virtual assistants? 

Virtual assistants offer users many advantages. However, not all users want to communicate with a chatbot. How well it is accepted can be determined with the help of the following KPIs:

Offence rate 

This shows how often users insult the chatbot. To measure this, a dedicated intent with training phrases containing different example insults must be created. If this intent is triggered disproportionately often, the cause should be investigated.

Average number of messages per conversation

This indicator can be used to determine how many questions are asked before the chatbot outputs the requested information. It also describes how much time users have spent writing. It should be noted, however, that the ideal number of questions required varies considerably depending on the use case.

Initiation of the human handover within a certain number of user messages 

If users demand to speak to a real employee immediately, this can mean that acceptance is very low, as people do not want to interact with the bot.
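
A possible way to quantify this is to check whether a handover request appears within the first few user messages of a conversation. In the sketch below, the intent name `human_handover`, the limit of two messages and the sample data are assumptions.

```python
def early_handover_rate(conversations, max_user_messages=2, handover_intent="human_handover"):
    """Share of conversations in which the user asks for a human early on.

    `conversations` is a list of conversations, each given as the ordered
    list of intents matched for the user's messages.
    """
    if not conversations:
        return 0.0
    early = sum(
        1 for intents in conversations
        if handover_intent in intents[:max_user_messages]
    )
    return early / len(conversations)


# Hypothetical conversations, one list of matched intents per conversation.
conversations = [
    ["human_handover"],                                  # immediate request
    ["order_status", "order_status", "human_handover"],  # request only after trying the bot
    ["greeting", "human_handover"],                      # request in the second message
    ["opening_hours"],                                   # no handover at all
]
rate = early_handover_rate(conversations)
```

Two of the four sample conversations contain a handover request within the first two user messages, so the rate is 50%. The second conversation is deliberately not counted, because the user tried the bot first.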

Exit rates 

Exit rates show at which point in the conversation path users have left the conversation and highlight the areas where the bot loses their attention.
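
Assuming the last dialogue step of each abandoned conversation is logged, the exit rate per step can be sketched like this. The step names and the function name are hypothetical.

```python
from collections import Counter


def exit_rates(last_steps):
    """Share of abandoned conversations per dialogue step.

    `last_steps` contains, for each abandoned conversation, the name of
    the step at which the user left.
    """
    counts = Counter(last_steps)
    total = len(last_steps)
    return {step: count / total for step, count in counts.items()}


# Hypothetical last steps of four abandoned conversations.
last_steps = ["greeting", "address_form", "address_form", "payment"]
rates = exit_rates(last_steps)
```

In this sample, half of the exits happen at the address form, which would make that step the first candidate for a redesign.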

Average length of the messages

The shorter the requests, the more likely users are to engage in a conversation with a virtual assistant. Overly long messages that mix several contexts are more difficult for the system to understand and process.

Retention Rate

How often do users return to the chatbot within a period of time? This KPI describes how successful interactions with the virtual assistant have been. If methods such as notifications in messaging platforms or ads are used to get users to interact with the chatbot again, these returns should be considered separately. The number of users who come back on their own should be tracked. If the chatbot offers real added value, the likelihood that users return on their own is all the higher.
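
A simple way to calculate retention is to compare the users of two consecutive periods. The sketch below also excludes users who were prompted to return, as suggested above. The user IDs, the monthly periods and the `prompted_users` parameter are assumptions for illustration.

```python
def retention_rate(first_period_users, second_period_users, prompted_users=()):
    """Share of first-period users who came back on their own in the second period.

    Users listed in `prompted_users` (for example, reached via notifications
    or ads) are not counted as returning on their own.
    """
    first = set(first_period_users)
    if not first:
        return 0.0
    returned = first & set(second_period_users)
    organic = returned - set(prompted_users)
    return len(organic) / len(first)


# Hypothetical user IDs per month.
january_users = {"u1", "u2", "u3", "u4"}
february_users = {"u2", "u4", "u5"}

overall_retention = retention_rate(january_users, february_users)
organic_retention = retention_rate(january_users, february_users, prompted_users={"u4"})
```

Two of the four January users return in February, which gives a retention of 50%. If one of them only came back after a notification, the organic retention drops to 25%.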

If a chatbot is not well received by users, there can be various reasons. For example, the target group may not yet be familiar with the use of virtual assistants. Or the chatbot has been poorly trained, so that using it is tedious and frustrating. A lack of user acceptance therefore does not necessarily mean that the use of a chatbot does not make sense.

There are countless metrics and KPIs that ultimately provide information about the success of chatbots and voicebots. It is important to be clear from the beginning about the problem to be solved and the goals to be achieved, because only on this basis can the right key figures be selected. The recipe for success lies in optimisation: really good chatbots are created when those responsible put resources into the maintenance and operation of the virtual assistants. A one-off development is not enough. The good news, however, is that training a chatbot is only very costly at the beginning. Over time, and from a certain level of training onwards, the effort decreases immensely.

To make your work easier, we have compiled a checklist with the most important KPIs.