Zhou Yu, Ziyu Xu, Alan W Black and Alexander Rudnicky
We propose a set of generic conversational strategies for handling possible system breakdowns in non-task-oriented dialog systems. We also design policies that select among these strategies according to the dialog context, combining expert knowledge with statistical findings derived from data. The policy learned via reinforcement learning outperforms both the random selection policy and the locally greedy policy in simulated and real-world settings. In addition, we propose three metrics for evaluating conversation quality that consider both the local and the global quality of the conversation.