
Steps for building conversational interfaces

· 5 min read
Yuri Santana

A conversational user interface (CUI) lets users interact with software through natural language, whether that’s text or voice.

It is designed to emulate human conversation, and that goal shapes every step of the process.

Before you start building your conversational interface, it’s important to define several aspects up front so that the design and development process is clearer and more direct.

Before building a conversational interface

You will need a clear vision of the following aspects:

  • Type of interaction: Is the user going to interact with the app using text, voice, or a mix of both? This will depend entirely on your company’s needs and on the type of interaction your users need.
  • Goal of interaction: Is it transactional (for example, you want your users to buy something) or relational? The answer will shape the word and interaction design.
  • Domain of knowledge: Is your app going to be a generalist that can talk about anything, or a specialist focused on your product and the topics surrounding it? This will not only narrow and shape the development process but also help with the conversation design.
  • Who takes the initiative: Is the app going to be proactive and lead the conversation, or reactive and respond only when prompted by the user?
  • Depth of conversation: Is it a single-turn exchange or a multi-turn conversation with the user?
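
One way to keep these five decisions visible during development is to record them in a small data structure. The sketch below is only an illustration: the class name `InterfaceBrief`, the enum names and the example values are assumptions made for this example, not part of any framework.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    VOICE = "voice"
    MIXED = "mixed"


class Goal(Enum):
    TRANSACTIONAL = "transactional"
    RELATIONAL = "relational"


class Domain(Enum):
    GENERALIST = "generalist"
    SPECIALIST = "specialist"


class Initiative(Enum):
    PROACTIVE = "proactive"
    REACTIVE = "reactive"


class Depth(Enum):
    SINGLE_TURN = "single-turn"
    MULTI_TURN = "multi-turn"


@dataclass(frozen=True)
class InterfaceBrief:
    """Hypothetical record of the five pre-build decisions listed above."""
    modality: Modality
    goal: Goal
    domain: Domain
    initiative: Initiative
    depth: Depth

    def summary(self) -> str:
        # Join the chosen values in the same order as the checklist.
        choices = (self.modality, self.goal, self.domain,
                   self.initiative, self.depth)
        return ", ".join(choice.value for choice in choices)


# Example values only: a voice assistant that sells a specific product.
brief = InterfaceBrief(
    modality=Modality.VOICE,
    goal=Goal.TRANSACTIONAL,
    domain=Domain.SPECIALIST,
    initiative=Initiative.REACTIVE,
    depth=Depth.MULTI_TURN,
)
print(brief.summary())
```

Using enums here means an unsupported choice fails when the brief is written, rather than surfacing later in the design process.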

Steps for creating conversational interfaces

Product design

Product design should involve the tech side, business knowledge and what your users need.

Create a list of possible functionalities and eliminate those that don’t fit your established goal of interaction, domain of knowledge and depth of conversation. Then define which of the remaining ones will be part of the minimum viable product (MVP).

Conversation structure

This is the point where your team needs to start crafting the happy path your application will follow. You will also need to define the order in which information is presented to the user once a keyword has been identified in their input and has triggered a search query against your database.

To learn more about conversation structure, check out Fonoster’s video on Conversational Interface Design.
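
The keyword-to-query step of a happy path can be sketched in a few lines. This is a minimal illustration under stated assumptions: `CATALOG` is a hypothetical in-memory stand-in for a real database, the keywords and records are invented, and real systems would typically use intent recognition rather than exact word matching.

```python
from typing import Optional

# Hypothetical in-memory stand-in for a product database.
CATALOG = {
    "flight": [
        {"name": "Madrid to Lisbon", "price": 80, "duration": "1h 20m"},
        {"name": "Madrid to Paris", "price": 120, "duration": "2h 05m"},
    ],
    "hotel": [
        {"name": "Central Hostel", "price": 35, "duration": "per night"},
    ],
}

# The order in which fields are presented to the user (a design decision).
PRESENTATION_ORDER = ("name", "price", "duration")

FALLBACK = "Sorry, I can help with: " + ", ".join(sorted(CATALOG)) + "."


def find_keyword(utterance: str) -> Optional[str]:
    """Return the first known keyword in the user's input, if any."""
    for word in utterance.lower().split():
        cleaned = word.strip(".,!?")
        if cleaned in CATALOG:
            return cleaned
    return None


def respond(utterance: str) -> str:
    """Look up the keyword and present each result in the agreed order."""
    keyword = find_keyword(utterance)
    if keyword is None:
        return FALLBACK
    lines = []
    for item in CATALOG[keyword]:
        lines.append(" | ".join(str(item[field]) for field in PRESENTATION_ORDER))
    return "\n".join(lines)


print(respond("I need a flight, please"))
```

Keeping the presentation order in one place makes it easy to change what the user hears first without touching the lookup logic.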

Interaction design

Much like with the conversation structure, your team will need to design how to resolve each interaction on the happy path, presenting the user with several options or ‘paths’ they can trigger to reach their desired outcome without friction.

For your application to learn, it must be built on conversational patterns that model the ideal interaction between the app and the user.
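
The options or ‘paths’ described above can be modelled as a small state machine, where each state has a prompt and a set of user choices leading to the next state. The sketch below is an assumption-laden illustration: the state names, prompts and options in `FLOW` are invented, and an unrecognized input simply keeps the user in the same state so the prompt can be repeated.

```python
# Hypothetical conversation flow: each state maps user choices to next states.
FLOW = {
    "start": {
        "prompt": "Would you like to book or cancel a trip?",
        "options": {"book": "destination", "cancel": "confirm_cancel"},
    },
    "destination": {
        "prompt": "Which country would you like to visit?",
        "options": {"spain": "done", "portugal": "done"},
    },
    "confirm_cancel": {
        "prompt": "Do you want to cancel your trip? Yes or no?",
        "options": {"yes": "done", "no": "start"},
    },
    "done": {"prompt": "All set. Goodbye!", "options": {}},
}


def next_state(state: str, user_input: str) -> str:
    """Follow the chosen path, or stay put if the input is not an option."""
    options = FLOW[state]["options"]
    return options.get(user_input.strip().lower(), state)


def walk(inputs, start="start"):
    """Return the list of states visited for a sequence of user inputs."""
    path = [start]
    state = start
    for text in inputs:
        state = next_state(state, text)
        path.append(state)
    return path


def dangling_targets():
    """List option targets that point at states missing from FLOW."""
    return [
        target
        for node in FLOW.values()
        for target in node["options"].values()
        if target not in FLOW
    ]


print(walk(["book", "Spain"]))
```

A check like `dangling_targets()` is a cheap way to catch a path that leads nowhere before users find it.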

Word design

Picking the exact words to provoke actions in your users is a science in itself. That’s why it’s important to choose specific words and sounds that guide the user toward the intended goal.

It helps to choose deliberately between open, closed and yes-or-no questions. Users have an easier time responding to ‘which country would you like to visit?’ than to ‘where do you want to go?’.

Personality design

This is where your team designs the aspects that define your assistant. Your team should be able to identify how the assistant will respond in specific circumstances and how it will never respond.

This is usually where an avatar is created with the demographic characteristics of the assistant and the behavior is defined extensively.

Sound design

It is now time to define the sound of your assistant. Is it going to be an automated voice or a voice actor?

This also includes setting up the sound effects that will be played when opening or closing the assistant.

After building

Prototype and testing

Once your conversational interface prototype is ready to be released to users, keep listening to feedback to see which features are working and which ones need to be polished or removed.

You can begin testing within your own team or community by reading the conversation structure out loud and noticing how participants respond to the choices or paths presented. Remember, the goal is to simulate human-to-human interaction. This is called analog testing.

You can also submit your conversation structure to a platform that will act as a user, allowing you to identify issues and corner cases. This is called automated testing.

Lastly, we have beta testing. This means making the application available to a selected group of users so you can get feedback from your own community before releasing it to a bigger audience.
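
To make the automated testing idea above more concrete, here is a minimal sketch of a scripted ‘simulated user’. Everything in it is hypothetical: `toy_bot` is a stand-in for your real interface, and the scripts and expected phrases are invented for the example. Dedicated testing platforms offer far more than this, but the principle is the same.

```python
def toy_bot(user_input: str) -> str:
    """Hypothetical bot under test: a stand-in for your real interface."""
    text = user_input.lower()
    if "book" in text:
        return "Which country would you like to visit?"
    if "spain" in text or "portugal" in text:
        return "Great, your trip is booked."
    return "Sorry, I did not understand. You can say 'book a trip'."


# Each script is a list of (user input, phrase expected in the reply).
SCRIPTS = {
    "happy path": [
        ("I want to book a trip", "which country"),
        ("Spain", "booked"),
    ],
    "corner case: empty input": [
        ("", "did not understand"),
    ],
    "corner case: negation": [
        ("I do not want to book anything", "did not understand"),
    ],
}


def run_scripts(bot, scripts):
    """Play every script against the bot and collect the turns that fail."""
    failures = {}
    for name, turns in scripts.items():
        for user_input, expected in turns:
            reply = bot(user_input)
            if expected.lower() not in reply.lower():
                failures.setdefault(name, []).append((user_input, reply))
    return failures


for name, failed_turns in run_scripts(toy_bot, SCRIPTS).items():
    print(name, failed_turns)
```

In this sketch the negation script fails, because the toy bot reacts to the word ‘book’ even when the user says they do not want to book. That is the kind of corner case scripted runs are meant to surface.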


After you have made your application available to your users, one quick way to find out whether it’s working and which features they prefer is to analyze metrics.

Metrics tell you whether the application has met the user’s objective, help you correct interactions and questions, and show which utterances you should train your interface on.

Many tools report metrics for both text and voice interfaces. They should give you a clear view of your users, how often they return, which functionalities they use and where they abandon your assistant.
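
As a rough illustration of those metrics, the sketch below computes them from a handful of session records. The log format, the field names (`user`, `steps`, `feature`) and the sample data are assumptions made for this example. Real analytics tools define their own schemas.

```python
from collections import Counter

# Hypothetical session log: one record per conversation.
SESSIONS = [
    {"user": "ana", "steps": ["start", "destination", "done"], "feature": "book"},
    {"user": "ana", "steps": ["start", "confirm_cancel"], "feature": "cancel"},
    {"user": "ben", "steps": ["start", "destination"], "feature": "book"},
    {"user": "cho", "steps": ["start", "destination", "done"], "feature": "book"},
]

# Assumed name of the step that means the user's objective was met.
GOAL_STEP = "done"


def summarize(sessions):
    """Compute completion, returning users, feature usage and drop-off points."""
    if not sessions:
        raise ValueError("at least one session is required")
    last_steps = [s["steps"][-1] for s in sessions]
    completed = sum(1 for step in last_steps if step == GOAL_STEP)
    visits = Counter(s["user"] for s in sessions)
    return {
        "completion_rate": completed / len(sessions),
        "returning_users": sorted(u for u, n in visits.items() if n > 1),
        "feature_usage": dict(Counter(s["feature"] for s in sessions)),
        "abandoned_at": dict(Counter(step for step in last_steps if step != GOAL_STEP)),
    }


for metric, value in summarize(SESSIONS).items():
    print(metric, value)
```

The `abandoned_at` count treats the last step of an unfinished session as the drop-off point, which is a simplification: a user may also leave because of something said earlier in the conversation.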