Deep RL Conversational Chatbot

Reza, Md Salim

Date

2024-05-07

Abstract

Natural language processing has made remarkable strides, particularly in machine translation. Despite extensive research worldwide, dialogue generation remains challenging and continues to attract researchers. Current dialogue generation models often produce inconsistent, generic answers, which makes conversations less engaging for users. This study examines recent methods proposed by NLP researchers, focusing on reinforcement learning principles. A recent reinforcement learning model for dialogue generation shows promise in generating conversational agent responses. However, its short memory can affect later results: it predicts utterances without considering the overall direction of the dialogue, which reduces coherence and interest. An updated seq2seq model, trained with reinforcement learning using the policy gradient method, encourages agents to take part in more engaging dialogues. Reward functions, such as semantic coherence, control conversation quality, and the model is evaluated with quantitative n-gram-based measurements.
The goal of this research is to develop an open-domain conversational AI chatbot using reinforcement learning. By applying recent research models, it aims to find an optimal hybrid solution for generating more interesting system utterances. Chatbots are integral to virtual assistants and messaging apps and are growing in popularity, which makes this project timely and relevant. The system comprises three components: the Natural Language Understanding (NLU) module, the Dialogue Manager (DM) module, and the reinforcement learning algorithm. The NLU module analyzes user utterances, using an ensemble machine learning strategy with a bidirectional LSTM for intent classification and entity extraction. The NLU output serves as input to the DM module, where reinforcement learning guides decision-making to maximize rewards based on the current and past dialogue state. The DM module aims to reach human-level learning performance quickly and to support engaging, meaningful, and logical conversations.
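
The policy gradient idea mentioned in the abstract can be illustrated with a toy sketch. This is not the report's implementation. It assumes a fixed set of three hypothetical candidate replies, hand-set rewards standing in for a learned signal such as semantic coherence, and a plain softmax policy in place of a seq2seq model. All names (`CANDIDATES`, `REWARDS`, `train`) are made up for illustration.

```python
import math
import random

# Hypothetical candidate replies and hand-set rewards (illustrative assumptions,
# not taken from the report). The generic replies receive low reward.
CANDIDATES = ["I don't know.", "Tell me more about your trip.", "Okay."]
REWARDS = [0.0, 1.0, 0.1]


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline over a softmax policy."""
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a reply from the current policy.
        action = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        reward = REWARDS[action]
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit i is 1[i == action] - probs[i].
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        # Track the average reward to reduce the variance of the update.
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


final_probs = train()
best = max(range(len(CANDIDATES)), key=lambda i: final_probs[i])
print(CANDIDATES[best], [round(p, 3) for p in final_probs])
```

In this sketch the policy shifts probability toward the reply with the highest reward. A full dialogue system would replace the fixed candidate list with a generator and the hand-set rewards with computed ones.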

URI

http://dspace.uiu.ac.bd/handle/52243/2983

Collections

M.Sc Thesis/Project [154]