What actually happens in the background when you ask a chatbot or voicebot a question and get an answer? To understand the technical process, you first need to know what intents, utterances, responses and entities are.

The responses of a bot are organized in intents

An intent is a user intention stored in the system. It consists of the name of the intent, the question variations (utterances) and, of course, the response:

Utterances and a Response of an Intent

As you can see in the illustration, the utterances field contains several variations of the question "How is the weather in Würzburg?", but the response field contains only one sentence. The chatbot therefore returns the same sentence every time the intent is matched.

Several utterances are necessary because every user phrases the question a little differently. One response is usually sufficient, but you can create variations here as well. Whether they are useful depends on the use case, as does the total number of intents in a chatbot.
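
The structure described above can be sketched as a small data type. This is only an illustration: the class name `Intent`, its field names, the second and third utterance, and the response text are assumptions made for this example, not the format of any particular chatbot platform.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """One user intention: a name, question variations, and response(s)."""
    name: str
    utterances: list[str]  # different ways users may phrase the question
    responses: list[str]   # usually one entry; more entries allow variation


# Example intent; all utterances except the first are invented for illustration.
weather_intent = Intent(
    name="Weather",
    utterances=[
        "How is the weather in Würzburg?",
        "What is the weather like in Würzburg?",
        "Tell me the weather for Würzburg",
    ],
    responses=["Here is the weather forecast for Würzburg."],
)

print(weather_intent.name, len(weather_intent.utterances))
```

Keeping `responses` as a list leaves room for the optional response variations without changing the structure.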

What types of intents are there?

Intents can be divided into two types: intents where the bot's response is always the same, and intents where the response has to be customized.

For example, the answer to the question "How long does an internship last?" is always "An internship in our company lasts 6 months". It is irrelevant whether the user is in Berlin or Munich and whether the question is asked today or tomorrow.

However, the answer to the question "How is the weather?" cannot always be the same. The weather changes daily and is also location-dependent. The bot must therefore be able to detect which location the user means in order to present information that matches the current weather situation.
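
The difference between the two intent types can be sketched in a few lines. The template wording and the function names below are assumptions made for this example; a real bot platform will have its own way of defining placeholders.

```python
from string import Template

# Static intent: the response text never changes.
STATIC_RESPONSE = "An internship in our company lasts 6 months."

# Customized intent: the response has placeholders that are filled per request.
# The wording of this template is invented for illustration.
DYNAMIC_RESPONSE = Template("The weather in $city is currently $condition.")


def render_static() -> str:
    """Return the fixed answer, independent of user, place, or time."""
    return STATIC_RESPONSE


def render_dynamic(city: str, condition: str) -> str:
    """Fill the placeholders with values looked up for this request."""
    return DYNAMIC_RESPONSE.substitute(city=city, condition=condition)


print(render_static())
print(render_dynamic("Würzburg", "sunny"))
```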


Entities help the bot deliver the right information

If we now take a closer look at the utterances, we find that certain words have a key function. They allow the bot to deliver the right information in the response.

These keywords are called entities. With their help, the chatbot can recognize which information it needs to access. The first entity "Weather" signals to the bot that it needs to query a weather data service such as OpenWeatherMap to retrieve the weather forecast. The second entity "Würzburg" tells it for which location it needs the forecast.
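
As a rough illustration, entity recognition can be approximated with a simple keyword lookup. Production systems typically use trained language models instead; the dictionary `ENTITY_VALUES`, the entity type names `topic` and `location`, and the function name are all assumptions made for this sketch.

```python
import re

# Hypothetical entity dictionary: entity type -> known values (lowercase).
ENTITY_VALUES = {
    "topic": {"weather"},
    "location": {"würzburg", "berlin", "munich"},
}


def extract_entities(text: str) -> dict[str, str]:
    """Return the first known value found for each entity type."""
    tokens = re.findall(r"\w+", text.lower())
    found: dict[str, str] = {}
    for token in tokens:
        for entity_type, values in ENTITY_VALUES.items():
            if token in values and entity_type not in found:
                found[entity_type] = token
    return found


print(extract_entities("How is the weather in Würzburg?"))
# {'topic': 'weather', 'location': 'würzburg'}
```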

How a chatbot or voicebot produces a response

What happens when a user asks the chatbot "How is the weather in Würzburg?"

The chatbot compares the request with all utterances in the system and calculates a confidence score for each intent. This score indicates how likely it is that the respective intent matches the request. In our case, the confidence score is highest for the intent "Weather".
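
The matching step can be sketched with a deliberately simple stand-in: word overlap (Jaccard similarity) between the request and each stored utterance, with the best utterance determining the score of its intent. Real systems usually rely on a trained classifier, so this value between 0 and 1 is not a true probability. The `INTENTS` table, the second weather utterance, and the function names are assumptions for this example.

```python
import re

# Hypothetical intent table: intent name -> stored utterances.
INTENTS = {
    "Weather": [
        "How is the weather in Würzburg?",
        "What is the weather like today?",
    ],
    "Internship": ["How long does an internship last?"],
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard similarity: shared words divided by all distinct words."""
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def score_intents(request: str) -> dict[str, float]:
    """Score each intent by its best-matching utterance."""
    return {
        name: max(similarity(request, utterance) for utterance in utterances)
        for name, utterances in INTENTS.items()
    }


scores = score_intents("How is the weather in Würzburg?")
best_intent = max(scores, key=scores.get)
print(best_intent, scores)
```

In this sketch the request is identical to a stored utterance, so "Weather" receives the highest possible score, while "Internship" shares only the word "how" with the request and scores far lower.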

The system then checks the request for possible entities and finds "Weather" and "Würzburg".

Because of these two entities, the chatbot knows that it needs to query a service like OpenWeatherMap and retrieve the forecast for the city of Würzburg. This information is then inserted into the response.

This answer is then displayed to the users.
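
The last step, building the response from the recognized entities, might look roughly like this. To keep the sketch self-contained, the weather lookup is replaced by a hard-coded dictionary with made-up placeholder values; a real bot would call an external weather service at that point. All function names, the fallback messages, and the response wording are assumptions.

```python
from typing import Optional

# Placeholder data standing in for a real weather service.
# The forecast value is made up for illustration only.
FAKE_FORECASTS = {"würzburg": "sunny, 21 °C"}


def fetch_forecast(location: str) -> Optional[str]:
    """Look up the forecast; a real bot would query a weather API here."""
    return FAKE_FORECASTS.get(location.lower())


def build_response(entities: dict[str, str]) -> str:
    """Insert the retrieved forecast into the response text."""
    location = entities.get("location")
    if location is None:
        return "For which city would you like the weather forecast?"
    forecast = fetch_forecast(location)
    if forecast is None:
        return f"Sorry, I have no forecast for {location}."
    return f"The weather in {location} is {forecast}."


print(build_response({"topic": "weather", "location": "Würzburg"}))
```

Handling the missing-location and missing-forecast cases explicitly is a design choice of this sketch; how a given platform deals with incomplete requests is not covered here.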

With a voicebot, there is an additional step at the beginning and at the end. First, the spoken request is converted into written text by speech recognition software (STT = speech-to-text) before it can be analyzed. At the end, the chatbot's answer is converted into spoken language by text-to-speech (TTS) software.
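
The two extra voicebot steps can be pictured as a wrapper around the text-based bot. In this sketch the speech components are passed in as functions, and the stubs used to run it simply convert between bytes and text; they are placeholders, not real speech software. All names are assumptions made for the example.

```python
from typing import Callable


def handle_voice_request(
    audio_in: bytes,
    speech_to_text: Callable[[bytes], str],
    answer_text: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Voicebot pipeline: STT, then the normal chatbot logic, then TTS."""
    request_text = speech_to_text(audio_in)    # extra step at the beginning
    response_text = answer_text(request_text)  # same processing as a chatbot
    return text_to_speech(response_text)       # extra step at the end


# Stand-in stubs so the sketch runs without any speech software.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def echo_bot(text: str) -> str:
    return f"You asked: {text}"


audio_out = handle_voice_request(
    "How is the weather?".encode("utf-8"), fake_stt, echo_bot, fake_tts
)
print(audio_out.decode("utf-8"))
```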
