My First Chatbot

Alaa MohyEldin
3 min read · Oct 3, 2018

In this post, I will try to make the process of building your first chatbot less intimidating. Hope you enjoy the ride!

We can divide the project into four consecutive steps, but before going through them we should decide on the scope of our chatbot. My chatbot can answer questions about the history of all the La Liga (the top Spanish football league) clubs, and it also gives a summary of each club's city.

First, let's install the project dependencies by running the following command:

sudo -H pip install -r requirements.txt

Step 1 — Data Collection

To accomplish this step, I crawled the following for each club: its name, history, page link, a summary of its Wikipedia page, location, stadium and, finally, stadium capacity.

Tools: the wikipedia Python wrapper and BeautifulSoup

Import the dependencies and get the La Liga page HTML:
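A minimal sketch of what this step could look like, assuming the wikipedia package and BeautifulSoup (bs4) are installed. The page title "La Liga" and the helper names fetch_page_html and wiki_links are illustrative assumptions, not taken from the project repo.

```python
from bs4 import BeautifulSoup


def fetch_page_html(title="La Liga"):
    """Download the rendered HTML of a Wikipedia page (needs network access)."""
    import wikipedia  # third-party wrapper; imported here so the parser below works offline

    # auto_suggest=False asks for exactly this title instead of a suggested one.
    return wikipedia.page(title, auto_suggest=False).html()


def wiki_links(html):
    """Map link text to href for every internal /wiki/ article link in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    links = {}
    for anchor in soup.find_all("a", href=True):
        text = anchor.get_text(strip=True)
        href = anchor["href"]
        # Keep article links only; skip namespaced pages such as File: or Help:.
        if text and href.startswith("/wiki/") and ":" not in href:
            links.setdefault(text, href)
    return links


# Typical use (requires network access):
# html = fetch_page_html()
# links = wiki_links(html)
```

From the resulting dictionary you would still need to pick out the club pages, for example by restricting the search to the table that lists the current clubs.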

Looking through the clubs' Wikipedia pages, I found that the club history mostly sits either under the History section or in the text of the first h3 HTML tag on the club page.
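The rule above could be sketched roughly as follows, assuming BeautifulSoup is installed. "The text of the first h3 tag" is read here as the paragraphs that follow that heading, which is an interpretation rather than something the post confirms; the function names are hypothetical.

```python
from bs4 import BeautifulSoup


def text_after_first_h3(html):
    """Return the paragraphs that follow the first <h3> tag, up to the next heading."""
    soup = BeautifulSoup(html, "html.parser")
    first_h3 = soup.find("h3")
    if first_h3 is None:
        return ""
    paragraphs = []
    for element in first_h3.find_all_next(["p", "h2", "h3", "h4"]):
        if element.name != "p":
            break  # reached the next heading, so the section is over
        paragraphs.append(element.get_text(" ", strip=True))
    return "\n".join(paragraphs)


def club_history(history_section, html):
    """Prefer the History section text; otherwise fall back to the first <h3> section."""
    if history_section and history_section.strip():
        return history_section.strip()
    return text_after_first_h3(html)
```

With the wikipedia wrapper, page.section("History") returns the section's plain text or None, so a call might look like club_history(page.section("History"), page.html()).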

Extract the information and build the dataset:

As you can see, not everything goes smoothly: for FC Barcelona, CD Leganés and Valencia CF I had to find the exact history section and hard-code it myself. I also added a second variation of the club name, where one exists, to help when generating the chatbot responses.
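One way to handle both points is sketched below. The affix list, the field names and the idea of deriving the second name by dropping affixes such as FC or CD are assumptions made for illustration; the post does not say how the variations were produced.

```python
# Affixes that are often dropped in everyday club names (assumed list).
CLUB_AFFIXES = {"FC", "CF", "CD", "SD", "UD", "RCD"}


def name_variation(club_name):
    """Return a shorter second name for the club, or None if there is none."""
    words = club_name.split()
    kept = [word for word in words if word.upper() not in CLUB_AFFIXES]
    if kept and len(kept) < len(words):
        return " ".join(kept)
    return None


def build_record(name, extracted_history, history_overrides):
    """Build one dataset row, letting a hand-written history replace the crawled one."""
    return {
        "name": name,
        "name_variation": name_variation(name),
        "history": history_overrides.get(name, extracted_history),
    }
```

Here history_overrides would be a small dictionary keyed by club name that holds the hand-picked history text for the clubs where the automatic extraction fails.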

Step 2— Building The Intent Classifier

One of the most important steps in building a chatbot is identifying the intent behind the user's question, so the bot can give the most suitable reply. One of the most convenient ways I found to build an intent classifier is the Rasa NLU pipeline. Building the classifier can be done in the following steps:

  1. State the intents you are interested in, for example great, thankyou, city_question and laliga_question. Write out example messages for each intent and save them as laliga_intents.md. Note that you can add as many intents as you want.
  2. Write the NLU model configuration that best suits your data (depending on data size, scope, etc.) and save it as laliga_intents_config.yml. The Rasa NLU documentation will help you choose a suitable pipeline for your data. Here I used the following pipeline:
  • Message Tokenization
  • POS tagging
  • GloVe vectors extracted for each token
  • Concatenate those vectors to form a feature vector for each sentence
  • Build a multiclass SVM model for intent classification
  • CRF model trained on message tokens and POS tags for entity extraction

  3. Run the following command in the directory where laliga_intents.md and laliga_intents_config.yml are saved. This will create the directory ./models/current/nlu, with the trained model files stored in it.

python -m rasa_nlu.train -c laliga_intents_config.yml --data laliga_intents.md -o models --fixed_model_name nlu --project current --verbose

  4. To use the intent classifier, load the trained model with rasa_nlu.model.Interpreter.
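The first and last steps above can be sketched as follows. The example messages are hypothetical and only illustrate the Rasa NLU markdown training format; loading the model assumes the rasa_nlu package (the release line current when this post was written) and a finished training run. The helper names are illustrative.

```python
from pathlib import Path

# Hypothetical example messages in the Rasa NLU markdown training format.
EXAMPLE_INTENTS = """\
## intent:thankyou
- thank you
- thanks a lot

## intent:city_question
- tell me about the city of Valencia
- what is Sevilla like

## intent:laliga_question
- what is the history of Real Madrid
- tell me about the history of Barcelona
"""


def write_intents(path="laliga_intents.md"):
    """Save the example intents where the training command expects them."""
    Path(path).write_text(EXAMPLE_INTENTS, encoding="utf-8")
    return path


def load_interpreter(model_dir="./models/current/nlu"):
    """Load the trained model (requires rasa_nlu and a finished training run)."""
    from rasa_nlu.model import Interpreter  # imported lazily: only needed here

    return Interpreter.load(model_dir)


def top_intent(parse_result):
    """Pull the intent name and confidence out of an Interpreter.parse() result."""
    intent = parse_result.get("intent") or {}
    return intent.get("name"), intent.get("confidence", 0.0)


# Typical use once the model is trained:
# interpreter = load_interpreter()
# name, confidence = top_intent(interpreter.parse("what is the history of Real Madrid"))
```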

The interpreter returns a dictionary that contains the predicted intent with its confidence, along with any extracted entities.

Step 3— Generate Chatbot Responses

The chatbot depends on two factors to generate its responses:

  • The intent type of the user's question and its confidence
  • The types of the entities extracted from the user's question

To extract the entities I used the spaCy multi-language model, which you can download with the following command:

python -m spacy download xx_ent_wiki_sm
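A rough sketch of how these two factors might be combined. The confidence threshold, the reply texts and the dataset field names (name, name_variation, history, city_summary) are all assumptions for illustration and are not taken from the project code.

```python
CONFIDENCE_THRESHOLD = 0.5  # assumed cut-off; tune it on your own data


def extract_entities(text, nlp):
    """Return (text, label) pairs found by a loaded spaCy pipeline."""
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


def find_club(entities, clubs):
    """Match an extracted entity against club names and their variations."""
    for entity_text, _label in entities:
        for club in clubs:
            names = [club["name"], club.get("name_variation")]
            if entity_text.lower() in [n.lower() for n in names if n]:
                return club
    return None


def generate_response(intent, confidence, entities, clubs):
    """Pick a reply from the intent, its confidence and the extracted entities."""
    if confidence < CONFIDENCE_THRESHOLD:
        return "Sorry, I did not understand that."
    if intent == "thankyou":
        return "You are welcome!"
    club = find_club(entities, clubs)
    if club is None:
        return "Which La Liga club do you mean?"
    if intent == "laliga_question":
        return club["history"]
    if intent == "city_question":
        return club["city_summary"]
    return "Sorry, I cannot help with that yet."


# Typical use (requires spaCy and the downloaded model):
# import spacy
# nlp = spacy.load("xx_ent_wiki_sm")
# entities = extract_entities("Tell me about Barcelona", nlp)
```

Checking the confidence first keeps the bot from answering confidently when the classifier is unsure, and matching entities against both name forms is where the second name variation from Step 1 pays off.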

Step 4— Deploying Chatbot on Facebook Messenger

To deploy the chatbot on Facebook Messenger, follow the official Facebook documentation. Then, to test the chatbot, run the Flask server:

python app.py

To make your local web server public, use ngrok:

ngrok http 8888
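For orientation, a Messenger webhook usually has two jobs: answering Facebook's verification request and reading incoming messages. The sketch below shows one possible shape for app.py, assuming Flask is installed; the function names and the reply_to callback are hypothetical and the project's actual app.py may differ.

```python
def verify_subscription(args, verify_token):
    """Answer Facebook's webhook verification GET request.

    Returns (body, status): the challenge is echoed back only when the token matches.
    """
    if args.get("hub.mode") == "subscribe" and args.get("hub.verify_token") == verify_token:
        return args.get("hub.challenge", ""), 200
    return "Verification token mismatch", 403


def incoming_messages(payload):
    """Yield (sender_id, text) for each text message in a webhook POST body."""
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            text = event.get("message", {}).get("text")
            if text:
                yield event["sender"]["id"], text


def create_app(verify_token, reply_to):
    """Wire the helpers into a Flask app; reply_to(sender_id, text) sends the answer."""
    from flask import Flask, request  # imported lazily so the helpers stay dependency-free

    app = Flask(__name__)

    @app.route("/", methods=["GET"])
    def verify():
        return verify_subscription(request.args, verify_token)

    @app.route("/", methods=["POST"])
    def webhook():
        for sender_id, text in incoming_messages(request.get_json(silent=True) or {}):
            reply_to(sender_id, text)
        return "ok", 200

    return app


# Typical use, matching the ngrok port above:
# create_app("my-verify-token", reply_to).run(port=8888)
```

The reply_to callback is where the intent classifier and response generator would be called before sending the answer back through the Messenger Send API.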

All the code for this project is in this repo.

Hope you enjoy it!
