EMNLP 2021 Selected Papers: Chatbots and Dialogue Systems

方浩旷
2023-12-01

A Role-Selected Sharing Network for Joint Machine-Human Chatting Handoff and Service Satisfaction Analysis (Liu2021)
Abstract:
Chatbot is increasingly thriving in different domains, however, because of unexpected discourse complexity and training data sparseness, its potential distrust hatches vital apprehension. Recently, Machine-Human Chatting Handoff (MHCH), predicting chatbot failure and enabling human-algorithm collaboration to enhance chatbot quality, has attracted increasing attention from industry and academia. In this study, we propose a novel model, Role-Selected Sharing Network (RSSN), which integrates both dialogue satisfaction estimation and handoff prediction in one multi-task learning framework. Unlike prior efforts in dialog mining, by utilizing local user satisfaction as a bridge, global satisfaction detector and handoff predictor can effectively exchange critical information. Specifically, we decouple the relation and interaction between the two tasks by the role information after the shared encoder. Extensive experiments on two public datasets demonstrate the effectiveness of our model.
author:
Jiawei Liu, Kaisong Song, Yangyang Kang, Guoxiu He, Zhuoren Jiang, Changlong Sun, Wei Lu, Xiaozhong Liu
challenge: Machine-Human Chatting Handoff
innovation: Role-Selected Sharing Network (RSSN), which integrates both dialogue satisfaction estimation and handoff prediction in one multi-task learning framework
dataset: two publicly available Chinese customer service dialogue datasets, namely Clothes and Makeup, collected by Song et al. (2019) from Taobao.
evaluation metrics: Macro F1 (Mac. F1), Accuracy (Acc.), Golden Transfer within Tolerance (GT-T) (Liu et al., 2021)
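
As a reminder of what the first two metrics measure, here is a minimal sketch of accuracy and macro F1 for per-utterance labels. The label names ("normal", "transfer") and the toy data are made up for illustration and are not taken from the paper; GT-T is not covered here because these notes do not define it.

```python
def accuracy(gold, pred):
    """Fraction of positions where the predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical handoff labels for five utterances.
gold = ["normal", "normal", "transfer", "normal", "transfer"]
pred = ["normal", "transfer", "transfer", "normal", "normal"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

Macro F1 weights every class equally, which matters when one class is much rarer than the other.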

MRF-Chat: Improving Dialogue with Markov Random Fields (Grover2021)
Abstract:
Recent state-of-the-art approaches in open-domain dialogue include training end-to-end deep-learning models to learn various conversational features like emotional content of response, symbolic transitions of dialogue contexts in a knowledge graph and persona of the agent and the user, among others. While neural models have shown reasonable results, modelling the cognitive processes that humans use when conversing with each other may improve the agent’s quality of responses. A key element of natural conversation is to tailor one’s response such that it accounts for concepts that the speaker and listener may or may not know and the contextual relevance of all prior concepts used in conversation. We show that a rich representation and explicit modeling of these psychological processes can improve predictions made by existing neural network models. In this work, we propose a novel probabilistic approach using Markov Random Fields (MRF) to augment existing deep-learning methods for improved next utterance prediction. Using human and automatic evaluations, we show that our augmentation approach significantly improves the performance of existing state-of-the-art retrieval models for open-domain conversational agents.
author:
Ishaan Grover, Matthew Huggins, Cynthia Breazeal, Hae Won Park
challenge:
understand concepts that the speaker and listener may or may not know and the contextual relevance of all prior concepts used in conversation.
innovation: using Markov Random Fields (MRF); the first model that aims to capture contextual relevance.
dataset: KV Memory model on Persona-Chat; Poly-encoder pre-trained on the ConvAI2 dataset and evaluated on the validation (1,009 conversations) and test (980 conversations) sets of the BlendedSkillTalk (BST) dataset (Smith et al., 2020).
evaluation metrics:

  1. Human Evaluation. Workers were asked which response is better, based on the conversation, on a four-point scale of “Response 1 is much better”, “Response 1 is slightly better”, “Response 2 is slightly better”, and “Response 2 is much better”.
  2. Automatic Evaluation. Hits@1 and Mean Reciprocal Rank (MRR)
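
The two automatic metrics can be sketched as follows, assuming each test example yields a list of candidate scores and the index of the gold response. The function names and the toy ranks are illustrative only, and ties are resolved in favour of the gold candidate, which is an assumption of this sketch.

```python
def gold_rank(scores, gold_index):
    """1-based rank of the gold candidate under descending score.

    Ties are resolved optimistically (the gold candidate wins them).
    """
    gold_score = scores[gold_index]
    return 1 + sum(s > gold_score for s in scores)


def hits_at_k(ranks, k=1):
    """Fraction of examples whose gold response is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the gold response over all examples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy gold ranks for four retrieval examples.
ranks = [1, 3, 2, 1]
print(hits_at_k(ranks, k=1), mean_reciprocal_rank(ranks))
```

Hits@1 only rewards an exact top-ranked hit, while MRR still gives partial credit when the gold response is ranked second or third.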

Simulated Chats for Building Dialog Systems: Learning to Generate Conversations from Instructions (Mohapatra2021)
In this paper, we present a data creation strategy that uses the pre-trained language model, GPT2, to simulate the interaction between crowd workers by creating a user bot and an agent bot. We train the simulators using a smaller percentage of actual crowd-generated conversations and their corresponding instructions. In short: use a pre-trained model to generate simulated dialogues, then train a model on them.

Retrieve, Discriminate and Rewrite: A Simple and Effective Framework for Obtaining Affective Response in Retrieval-Based Chatbots (Lu2021)
Existing works in retrieval-based chatbots are based on Retrieve-and-Rerank framework, which have a common problem of satisfying affect label at the expense of response quality.
Problem with retrieval-based chatbots: the affect label is satisfied at the expense of response quality.
Solution: a new framework. To address this problem, we propose a simple and effective Retrieve-Discriminate-Rewrite framework. The framework replaces the reranking mechanism with a new discriminate-and-rewrite mechanism, which predicts the affect label of the retrieved high-quality response via discrimination module and further rewrites the affect unsatisfied response via rewriting module. In other words, the discrimination module predicts the affect label of the retrieved response, and the rewriting module corrects the response when that label does not match.
The authors also annotated an affective Douban corpus for training: labels for 1,400 dialogues with a total of 10,712 utterances.
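
The control flow of the retrieve, discriminate, rewrite idea can be sketched as below. This is only an illustration of the pipeline order described above: all three components are hypothetical stand-ins (word-overlap retrieval, a tiny keyword lexicon, and a prefix-based rewriter), not the paper's neural modules.

```python
def retrieve(context, candidates):
    """Stand-in retriever: pick the candidate sharing the most words with the context."""
    context_words = set(context.lower().split())
    return max(candidates, key=lambda c: len(context_words & set(c.lower().split())))


def discriminate(response):
    """Stand-in affect classifier based on a tiny hypothetical keyword lexicon."""
    lexicon = {"great": "happy", "glad": "happy", "sorry": "sad", "sad": "sad"}
    for word in response.lower().split():
        word = word.strip(".,!?")
        if word in lexicon:
            return lexicon[word]
    return "neutral"


def rewrite(response, target_affect):
    """Stand-in rewriter: prepend a phrase that carries the target affect."""
    prefixes = {"happy": "Glad to hear that!", "sad": "Sorry to hear that."}
    prefix = prefixes.get(target_affect, "")
    return f"{prefix} {response}".strip()


def respond(context, candidates, target_affect):
    """Retrieve the best response first; rewrite it only if its affect is wrong."""
    response = retrieve(context, candidates)
    if discriminate(response) == target_affect:
        return response
    return rewrite(response, target_affect)


candidates = ["Passing the driving test takes practice.", "I like pizza."]
print(respond("I passed my driving test today", candidates, "happy"))
```

The point of the ordering is that response quality is fixed by the retriever first, and affect is adjusted afterwards instead of being traded off during reranking.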

Efficient Dialogue Complementary Policy Learning via Deep Q-network Policy and Episodic Memory Policy
Reinforcement learning, Q-learning, dialogue policy learning

Multi-Modal Open-Domain Dialogue
Combines image and audio information

Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding(Dziri)
Problem with pre-trained models: they often generate factually incorrect statements
Focus: improving the faithfulness – and thus reduce hallucination – of Neural Dialogue Systems to known facts supplied by a Knowledge Graph (KG)
Proposed: Neural Path Hunter, which follows a generate-then-refine strategy whereby a generated response is amended using the k-hop subgraph of a KG.

Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining
Abstract: With the rapid increase in the volume of dialogue data from daily life, there is a growing demand for dialogue summarization. Unfortunately, training a large summarization model is generally infeasible due to the inadequacy of dialogue data with annotated summaries. Most existing works for low-resource dialogue summarization directly pretrain models in other domains, e.g., the news domain, but they generally neglect the huge difference between dialogues and conventional articles. To bridge the gap between out-of-domain pretraining and in-domain fine-tuning, in this work, we propose a multi-source pretraining paradigm to better leverage the external summary data. Specifically, we exploit large-scale in-domain non-summary data to separately pretrain the dialogue encoder and the summary decoder. The combined encoder-decoder model is then pretrained on the out-of-domain summary data using adversarial critics, aiming to facilitate domain-agnostic summarization. The experimental results on two public datasets show that with only limited training data, our approach achieves competitive performance and generalizes well in different dialogue scenarios.
Author:
Yicheng Zou, Bolin Zhu, Xingwu Hu, Tao Gui, Qi Zhang
challenge: inadequacy of dialogue data with annotated summaries
innovation: a multi-source pretraining paradigm to better leverage the external summary data; large-scale in-domain non-summary data is exploited to separately pretrain the dialogue encoder and the summary decoder.

Related paper:
MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
Song Feng, Siva Sankalp Patel, Hui Wan and Sachindra Joshi

CoLV: A Collaborative Latent Variable Model for Knowledge-Grounded Dialogue Generation (Zhan2021)
Abstract: Knowledge-grounded dialogue generation has achieved promising performance with the engagement of external knowledge sources. Typical approaches towards this task usually perform relatively independent two sub-tasks, i.e., knowledge selection and knowledge-aware response generation. In this paper, in order to improve the diversity of both knowledge selection and knowledge-aware response generation, we propose a collaborative latent variable (CoLV) model to integrate these two aspects simultaneously in separate yet collaborative latent spaces, so as to capture the inherent correlation between knowledge selection and response generation. During generation, our proposed model firstly draws knowledge candidate from the latent space conditioned on the dialogue context, and then samples a response from another collaborative latent space conditioned on both the context and the selected knowledge. Experimental results on two widely-used knowledge-grounded dialogue datasets show that our model outperforms previous methods on both knowledge selection and response generation.
Author:
Haolan Zhan, Lei Shen, Hongshen Chen, Hainan Zhang
challenge: improve the diversity of both knowledge selection and knowledge-aware response generation
innovation: collaborative latent variable (CoLV), firstly draws knowledge candidate from the latent space conditioned on the dialogue context, and then samples a response from another collaborative latent space conditioned on both the context and the selected knowledge.
dataset: Wizard of Wikipedia (Dinan et al., 2019) (WoW) and Holl-E (Moghe et al., 2018).
evaluation metrics: Accuracy, Perplexity, ROUGE-1, ROUGE-2 and Distinct-2; Human Evaluation Results: Win/Lose/Tie
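
Since diversity is the stated challenge, it helps to recall how Distinct-2 is computed. A minimal sketch, assuming whitespace tokenization and the corpus-level definition (unique n-grams divided by total n-grams over all responses); some implementations average per response instead, and the toy responses are made up.

```python
def distinct_n(responses, n=2):
    """Corpus-level Distinct-n: unique n-grams divided by total n-grams."""
    total = 0
    unique = set()
    for response in responses:
        tokens = response.split()
        # Slide a window of length n over the token list.
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


responses = ["i do not know", "i do not care"]
print(distinct_n(responses, n=2))
```

A model that keeps producing the same generic phrases gets a low score because repeated bigrams add to the total but not to the unique set.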

A Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation
Abstract:
Neural conversation models have shown great potentials towards generating fluent and informative responses by introducing external background knowledge. Nevertheless, it is laborious to construct such knowledge-grounded dialogues, and existing models usually perform poorly when transfer to new domains with limited training samples. Therefore, building a knowledge-grounded dialogue system under the low-resource setting is a still crucial issue. In this paper, we propose a novel three-stage learning framework based on weakly supervised learning which benefits from large scale ungrounded dialogues and unstructured knowledge base. To better cooperate with this framework, we devise a variant of Transformer with decoupled decoder which facilitates the disentangled learning of response generation and knowledge incorporation. Evaluation results on two benchmarks indicate that our approach can outperform other state-of-the-art methods with less training data, and even in zero-resource scenario, our approach still performs well.
Author:
Shilei Liu, Xiaofeng Zhao, Bochao Li, Feiliang Ren, Longhui Zhang and Shujuan Yin
challenge:
building a knowledge-grounded dialogue system under the low-resource setting
innovation:
novel three-stage learning framework based on weakly supervised learning
first stage: supervised learning to pre-train dialogue-related parameters
second stage: match a set of pseudo-knowledge for each ungrounded dialogue to construct a lower-quality knowledge-grounded dialogue dataset
third stage: the trained model is fine-tuned on the target low-resource dataset
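
To make the second stage more concrete, here is a rough sketch of attaching pseudo-knowledge to an ungrounded dialogue. These notes do not record how the paper performs the matching, so the Jaccard word-overlap scoring, the function names, and the toy knowledge sentences are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two strings."""
    a_words, b_words = set(a.lower().split()), set(b.lower().split())
    union = a_words | b_words
    return len(a_words & b_words) / len(union) if union else 0.0


def match_pseudo_knowledge(dialogue, knowledge_base, k=2):
    """Return the k knowledge sentences most similar to the dialogue text."""
    ranked = sorted(knowledge_base, key=lambda s: jaccard(dialogue, s), reverse=True)
    return ranked[:k]


# Hypothetical unstructured knowledge sentences.
knowledge_base = [
    "inception is a movie directed by christopher nolan",
    "paris is the capital of france",
    "the movie titanic was released in 1997",
]
pseudo = match_pseudo_knowledge("have you seen the movie inception", knowledge_base)
print(pseudo)
```

In this toy run the sentence about Titanic is also selected, which illustrates why the resulting dataset is described as lower quality: matched knowledge is noisy, and the third stage relies on the real low-resource data to correct for that.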

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
Song Feng, Siva Sankalp Patel, Hui Wan and Sachindra Joshi
Abstract:
We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents. Most previous works treat document-grounded dialogue modeling as a machine reading comprehension task based on a single given document or passage. In this work, we aim to address more realistic scenarios where a goal-oriented information-seeking conversation involves multiple topics, and hence is grounded on different documents. To facilitate such a task, we introduce a new dataset that contains dialogues grounded in multiple documents from four different domains. We also explore modeling the dialogue-based and document-based context in the dataset. We present strong baseline approaches and various experimental results, aiming to support further research efforts on such a task.

More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge
Sixing Wu, Ying Li, Minghui Wang, Dawei Zhang, Yang Zhou, Zhonghai Wu
Abstract:
Despite achieving remarkable performance, previous knowledge-enhanced works usually only use a single-source homogeneous knowledge base of limited knowledge coverage. Thus, they often degenerate into traditional methods because not all dialogues can be linked with knowledge entries. This paper proposes a novel dialogue generation model, MSKE-Dialog, to solve this issue with three unique advantages: (1) Rather than only one, MSKE-Dialog can simultaneously leverage multiple heterogeneous knowledge sources (it includes but is not limited to commonsense knowledge facts, text knowledge, infobox knowledge) to improve the knowledge coverage; (2) To avoid the topic conflict among the context and different knowledge sources, we propose a Multi-Reference Selection to better select context/knowledge; (3) We propose a Multi-Reference Generation to generate informative responses by referring to multiple generation references at the same time. Extensive evaluations on a Chinese dataset show the superior performance of this work against various state-of-the-art approaches. To our best knowledge, this work is the first to use the multi-source heterogeneous knowledge in the open-domain knowledge-enhanced dialogue generation.

problem of previous works: use a single-source homogeneous knowledge base of limited knowledge coverage
innovation:
MSKE-Dialog:

  1. leverage multiple heterogeneous knowledge sources
  2. better select context/knowledge, to avoid the topic conflict
  3. referring to multiple generation references
dataset: three open-released Chinese Weibo corpora (Shang et al., 2015; Ke et al., 2018; Cai et al., 2019)

Controllable Neural Dialogue Summarization with Personal Named Entity Planning: Summarization

Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks: DST

Graph Based Network with Contextualized Representations of Turns in Dialogue
Bongseok Lee, Yong Suk Choi
Abstract:
Dialogue-based relation extraction (RE) aims to extract relation(s) between two arguments that appear in a dialogue. Because dialogues have the characteristics of high personal pronoun occurrences and low information density, and since most relational facts in dialogues are not supported by any single sentence, dialogue-based relation extraction requires a comprehensive understanding of dialogue. In this paper, we propose the TUrn COntext awaRE Graph Convolutional Network (TUCORE-GCN) modeled by paying attention to the way people understand dialogues. In addition, we propose a novel approach which treats the task of emotion recognition in conversations (ERC) as a dialogue-based RE. Experiments on a dialogue-based RE dataset and three ERC datasets demonstrate that our model is very effective in various dialogue-based natural language understanding tasks. In these experiments, TUCORE-GCN outperforms the state-of-the-art models on most of the benchmark datasets.
challenge:
extract relation(s) between two arguments that appear in a dialogue, despite high personal pronoun occurrence and low information density
innovation:
TUrn COntext awaRE Graph Convolutional Network (TUCORE-GCN); also treats emotion recognition in conversations (ERC) as a dialogue-based RE task

Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation
Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang
Abstract:
Although pre-training models have achieved great success in dialogue generation, their performance drops dramatically when the input contains an entity that does not appear in pre-training and fine-tuning datasets (unseen entity). To address this issue, existing methods leverage an external knowledge base to generate appropriate responses. In real-world scenario, the entity may not be included by the knowledge base or suffer from the precision of knowledge retrieval. To deal with this problem, instead of introducing knowledge base as the input, we force the model to learn a better semantic representation by predicting the information in the knowledge base, only based on the input context. Specifically, with the help of a knowledge base, we introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of the masked entity given the context; 2) Hypernym Generation, which predicts the hypernym of the entity based on the context. Experiment results on two dialogue corpus verify the effectiveness of our methods under both knowledge available and unavailable settings.
challenge:
performance drops dramatically when the input contains an entity that does not appear in pre-training and fine-tuning datasets
innovation:
learn a better semantic representation by predicting the information in the knowledge base, only based on the input context

  1. Interpret Masked Word: predicts the word’s definition based on the context, where the definition is obtained from a knowledge base
  2. Hypernym Generation: predicts the corresponding hypernym of the word given by WordNet.
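Note: I find this paper a bit odd. It seems to be trying to learn a general mapping from an unseen word to its definition or hypernym, somewhat like guessing a word's meaning from the word itself and the preceding dialogue. I am not sure whether this is really meaningful.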
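
The two auxiliary objectives listed above can be pictured as extra training pairs built from each dialogue context. A minimal sketch, assuming a toy knowledge base held in a dictionary; the entry, the field names, the mask token, and the choice to leave the entity unmasked for hypernym generation are all assumptions of this sketch, not details from the paper.

```python
# Hypothetical knowledge base: entity -> definition and hypernym.
TOY_KB = {
    "corgi": {"definition": "a small dog with short legs", "hypernym": "dog"},
}


def build_auxiliary_examples(context, entity, kb=TOY_KB, mask_token="[MASK]"):
    """Build one training pair per auxiliary objective for a given entity."""
    entry = kb[entity]
    masked_context = context.replace(entity, mask_token)
    return [
        # Interpret Masked Word: recover the definition from the masked context.
        {"task": "interpret_masked_word", "input": masked_context, "target": entry["definition"]},
        # Hypernym Generation: predict the hypernym from the context.
        {"task": "hypernym_generation", "input": context, "target": entry["hypernym"]},
    ]


examples = build_auxiliary_examples("my corgi loves the park", "corgi")
for example in examples:
    print(example)
```

The knowledge base is only consulted to build targets at training time, which matches the stated idea of not feeding the knowledge base to the model as input.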

Adaptive Bridge between Training and Inference for Dialogue Generation
…to be continued
