What actually happens in the background when you ask a chatbot or voicebot a question and receive a response? To understand the technical context, you first need to know what intents, utterances, responses, and entities are.

The responses of a bot are organized in intents

An intent is a user intention stored in the system. It consists of the name of the intent, the question variations (utterances), and the answer (response):

[Figure: Utterances and a response of an intent]

As you can see in the illustration, there are several variations of the question "What is the weather like in Würzburg?" in the utterances, but only a single sentence in the response. The chatbot can therefore only ever output the same sentence in response to the question.

Creating multiple utterances is necessary because users phrase the same question in slightly different ways. As a rule, one response is sufficient, but you can create variations of the response as well. Whether these make sense depends on the use case, as does the total number of intents a chatbot needs.
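To make the structure of an intent more concrete, here is a minimal sketch in Python. The class name `Intent`, its fields, and the response sentence are assumptions made for illustration only; real chatbot platforms store intents in their own formats.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Intent:
    """One user intention: a name, question variations, and one or more answers."""
    name: str
    utterances: list[str]
    responses: list[str] = field(default_factory=list)

    def pick_response(self) -> str:
        # With a single stored response the bot always outputs the same sentence;
        # with several variations, one of them is chosen at random.
        return random.choice(self.responses)


# Hypothetical example data; the response sentence is a placeholder.
weather = Intent(
    name="Weather",
    utterances=[
        "What is the weather like in Würzburg?",
        "How is the weather in Würzburg?",
        "Tell me the weather in Würzburg",
    ],
    responses=["It is currently sunny in Würzburg."],
)

print(weather.pick_response())
```

Because only one response is stored here, every call returns the same sentence, just as described above.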

What types of intents are there?

Intents can be divided into two types: intents where the response of the bot is always the same, and intents where the response has to be customized.

For example, the answer to the question "How long does an internship last?" is always "An internship at our company lasts 6 months". It is irrelevant whether the user is in Berlin or Munich and whether he or she asks the question today or tomorrow.

However, the answer to the question "What is the weather like?" cannot always be the same. The weather changes daily and is also dependent on location. The bot must therefore be able to detect where the user is in order to present information adapted to the current weather situation.
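The difference between the two types can be sketched in a few lines of Python. The class names, the template text, and the context values below are assumptions for illustration, not part of any particular chatbot product.

```python
from dataclasses import dataclass


@dataclass
class StaticIntent:
    """Intent whose response is always the same."""
    name: str
    response: str

    def respond(self, context: dict) -> str:
        # The context (location, date, ...) is irrelevant for this type.
        return self.response


@dataclass
class DynamicIntent:
    """Intent whose response has to be customized for each request."""
    name: str
    template: str

    def respond(self, context: dict) -> str:
        # Placeholders in the template are filled with request-specific values.
        return self.template.format(**context)


internship = StaticIntent(
    "InternshipDuration", "An internship at our company lasts 6 months."
)
weather = DynamicIntent("Weather", "The weather in {city} today: {forecast}.")

# Invented context values, used only to show the mechanism.
context = {"city": "Würzburg", "forecast": "sunny"}
print(internship.respond(context))
print(weather.respond(context))
```

The static intent ignores the context entirely, while the dynamic intent cannot produce a sensible answer without it.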

Entities help the bot deliver the right information

If we now take a closer look at the utterances, we find that certain words have a key function: they tell the bot which information to deliver in the response.

These keywords are called entities. With their help, the chatbot can recognize which information it needs to access. The first entity, "weather", signals to the bot that it needs to query a weather service such as OpenWeatherMap to retrieve the forecast. The second entity, "Würzburg", tells it which location the forecast is for.
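A very simple way to picture entity recognition is a lookup against lists of known keywords. The following Python sketch assumes a hypothetical dictionary `ENTITY_VALUES` and a helper `extract_entities`; production systems usually rely on trained language models rather than plain keyword matching.

```python
import re

# Hypothetical entity dictionary: entity type -> known values (lowercase).
ENTITY_VALUES = {
    "topic": {"weather"},
    "city": {"würzburg", "berlin", "munich"},
}


def extract_entities(text: str) -> dict[str, str]:
    """Return the first known value found in the text for each entity type."""
    tokens = re.findall(r"\w+", text.lower())
    found: dict[str, str] = {}
    for token in tokens:
        for entity_type, values in ENTITY_VALUES.items():
            if token in values and entity_type not in found:
                found[entity_type] = token
    return found


print(extract_entities("What is the weather like in Würzburg?"))
```

For the example utterance, the sketch finds the topic "weather" and the city "würzburg"; for a question without any known keyword it returns an empty dictionary.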

How a chatbot or voicebot produces its response

What happens when users ask the chatbot the question "How is the weather in Würzburg?"

The chatbot compares the request with all utterances in the system and calculates a confidence score for each intent. This score indicates how likely it is that the request belongs to the respective intent. In our case, the confidence score is highest for the intent "Weather".
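The matching step can be illustrated with a deliberately simple scoring method. The Python sketch below uses word overlap (Jaccard similarity) between the request and each stored utterance as a stand-in for the confidence score. This is an assumption chosen for readability: real chatbots typically use trained classifiers, and a word-overlap value is not a true probability. The intent names and utterances are example data.

```python
import re

# Example intents with their stored utterances (illustrative data).
INTENTS = {
    "Weather": [
        "What is the weather like in Würzburg?",
        "How is the weather today?",
        "Will it rain in Würzburg?",
    ],
    "InternshipDuration": [
        "How long does an internship last?",
        "What is the duration of an internship?",
    ],
}


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard similarity: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def score_intents(request: str) -> dict[str, float]:
    """Score each intent by its best-matching utterance."""
    return {
        name: max(similarity(request, utterance) for utterance in utterances)
        for name, utterances in INTENTS.items()
    }


scores = score_intents("How is the weather in Würzburg?")
best_intent = max(scores, key=scores.get)
print(scores)
print(best_intent)
```

With this example data, the intent "Weather" receives the highest score because its first utterance shares five of eight distinct words with the request.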

The system then checks the request for possible entities:

Because of the two entities "weather" and "Würzburg", the chatbot knows that it needs to query a service like OpenWeatherMap and retrieve the forecast for the city of Würzburg. This information is then built into the response.

This answer is then displayed to the users.
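The last step, building the retrieved information into the response, might look roughly like this in Python. The function `fetch_forecast` is a hypothetical stand-in that returns invented values from a local dictionary. It does not call OpenWeatherMap or any other real service, and the response template is likewise an assumption.

```python
# Stand-in for a real weather service; the values are invented for illustration.
FAKE_FORECASTS = {"würzburg": "sunny, 21 °C"}

# Hypothetical response template with placeholders for the dynamic parts.
RESPONSE_TEMPLATE = "The weather in {city} is currently {forecast}."


def fetch_forecast(city: str) -> str:
    """Look up the forecast for a city (placeholder for a real API request)."""
    return FAKE_FORECASTS.get(city.lower(), "unknown")


def build_response(entities: dict[str, str]) -> str:
    """Fill the response template using the recognized city entity."""
    city = entities["city"]
    forecast = fetch_forecast(city)
    return RESPONSE_TEMPLATE.format(city=city, forecast=forecast)


print(build_response({"topic": "weather", "city": "Würzburg"}))
```

In a real bot, `fetch_forecast` would be replaced by a request to the chosen weather service, while the template mechanism stays the same.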

With a voicebot, there is an additional step at the beginning and at the end: the spoken request is first converted into written text by speech recognition software (STT = speech-to-text) before it can be analyzed. At the end, the bot's answer is converted back into spoken language by text-to-speech software (TTS).
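The two extra steps of a voicebot can be pictured as a wrapper around the text-based chatbot. In the following Python sketch, all three components are placeholders passed in as functions: `fake_stt`, `fake_bot`, and `fake_tts` are hypothetical and only simulate speech recognition, answering, and speech synthesis by encoding and decoding text.

```python
from typing import Callable


def run_voicebot(
    audio_in: bytes,
    speech_to_text: Callable[[bytes], str],
    answer_text: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Voicebot pipeline: STT, then the regular chatbot, then TTS."""
    request_text = speech_to_text(audio_in)       # additional step at the beginning
    response_text = answer_text(request_text)     # same processing as a chatbot
    return text_to_speech(response_text)          # additional step at the end


# Placeholder components: "audio" is simply UTF-8 encoded text here.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_bot(text: str) -> str:
    return f"You asked: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


audio_out = run_voicebot(
    "How is the weather in Würzburg?".encode("utf-8"),
    fake_stt,
    fake_bot,
    fake_tts,
)
print(audio_out.decode("utf-8"))
```

The chatbot logic in the middle stays unchanged; only the input and output conversion differ between a chatbot and a voicebot.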
