Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning (3 minutes introduction)

Yi Shi (Xmov, China), Congyi Wang (Xmov, China), Yu Chen (Xmov, China), Bin Wang (Xmov, China)

The majority of Chinese characters are monophonic, while a special group of characters, called polyphonic characters, have multiple pronunciations. As a prerequisite for speech-related generative tasks, the correct pronunciation must be identified among several candidates. This process is called polyphone disambiguation. Although the problem has been well explored with both knowledge-based and learning-based approaches, it remains challenging due to the lack of publicly available labeled datasets and the irregular nature of polyphones in Mandarin Chinese. In this paper, we propose a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data. We explore the effect of various proxy labeling strategies, including entropy thresholding and lexicon-based labeling. Qualitative and quantitative experiments demonstrate that our method achieves state-of-the-art performance. In addition, we publish a novel dataset specifically for the polyphone disambiguation task to promote further research.
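
To make the idea of entropy-thresholded proxy labeling concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the threshold value of 0.3, and the example probabilities are all assumptions chosen for illustration. The sketch only shows the general technique, in which an unlabeled sample receives a proxy label when the model's predicted distribution over candidate pronunciations has low entropy.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def proxy_label(candidates, probs, threshold=0.3):
    """Return the most likely pronunciation when the prediction entropy is
    below `threshold`; otherwise return None and leave the sample unlabeled.

    `threshold` is a hypothetical value, not one taken from the paper.
    """
    if entropy(probs) >= threshold:
        return None
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best]


# Hypothetical model outputs for the polyphonic character 行 (xing2 / hang2).
confident = proxy_label(["xing2", "hang2"], [0.97, 0.03])  # low entropy
uncertain = proxy_label(["xing2", "hang2"], [0.55, 0.45])  # high entropy
```

In this sketch the confident prediction is accepted as a proxy label, while the near-uniform one is rejected, so it would not be added to the training set.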

Related Recordings

Improving Polyphone Disambiguation for Mandarin Chinese by Combining Mix-pooling Strategy and Window-based Attention
(longer introduction)

Junjie Li, Zhiyu Zhang, Minchuan Chen, Jun Ma, Shaojun Wang, Jing Xiao

A Neural-Network-Based Approach to Identifying Speakers in Novels
(3 minutes introduction)

Yue Chen, Zhen-Hua Ling, Qing-Feng Liu

InterSpeech 2021
