
ABSTRACT Automatic Summarization of English Broadcast News Speech.pdf

Published: 2017-04-10 · approx. 29,200 characters · 6 pages
Automatic Summarization of English Broadcast News Speech

Chiori Hori†, Sadaoki Furui†, Rob Malkin‡, Hua Yu‡ and Alex Waibel‡

†Department of Computer Science, Tokyo Institute of Technology, 2-12-1, O-okayama, Meguro-ku, Tokyo, 152-8552 Japan
{chiori,furui}@furui.cs.titech.ac.jp

‡Interactive Systems Labs, Carnegie Mellon University, Pittsburgh, PA 15213, USA
{malkin,hua,ahw}@

ABSTRACT

This paper proposes an automatic speech summarization technique for English. In our proposed method, a set of words maximizing a summarization score, which indicates the appropriateness of the summarization, is extracted from automatically transcribed speech and concatenated to create a summary. The extraction process is performed using a Dynamic Programming (DP) technique according to a target compression ratio. In this paper, English broadcast news speech transcribed using a speech recognizer is automatically summarized. In order to apply our method, originally proposed for Japanese, to English, the model for estimating word concatenation probabilities based on a dependency structure in the original speech, given by a Stochastic Dependency Context Free Grammar (SDCFG), is modified. A summarization method for multiple utterances using a two-level DP technique is also proposed. The automatically summarized sentences are evaluated by a summarization accuracy based on comparison with manual summarization of correctly transcribed speech by human subjects. Experimental results show that our proposed method effectively extracts relatively important information and removes redundant and irrelevant information from English news speech.

Keywords

Speech summarization, Summarization scores, Two-level Dynamic Programming, Stochastic Dependency Context Free Grammar, Summarization accuracy

1. INTRODUCTION

Recently, large-vocabulary continuous-speech recognition (LVCSR) technology has made significant advances. Real-time systems can now achieve word accuracy of 90% and above for speech dictated
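The word extraction step described in the abstract (choosing a fixed number of words, in their original order, that maximizes a summarization score) can be illustrated with a small dynamic program. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `summarize`, the per-word scores `word_score`, and the pairwise `concat_score` function are hypothetical placeholders, and the SDCFG-based concatenation model and the two-level DP over multiple utterances are not reproduced here.

```python
from typing import Callable, List, Sequence


def summarize(
    words: Sequence[str],
    word_score: Sequence[float],
    concat_score: Callable[[int, int], float],
    target_len: int,
) -> List[str]:
    """Pick target_len words, in original order, maximizing the total score.

    Total score (an assumption for this sketch) = sum of word_score over the
    selected words + concat_score(j, i) for each adjacent selected pair j < i.
    """
    n = len(words)
    if not 0 < target_len <= n:
        raise ValueError("target_len must be between 1 and len(words)")

    neg_inf = float("-inf")
    # best[m][i]: best score of a summary of m + 1 words whose last word is i.
    best = [[neg_inf] * n for _ in range(target_len)]
    # back[m][i]: index of the previous selected word on that best path.
    back = [[-1] * n for _ in range(target_len)]

    for i in range(n):
        best[0][i] = word_score[i]

    for m in range(1, target_len):
        for i in range(m, n):
            for j in range(m - 1, i):
                cand = best[m - 1][j] + concat_score(j, i) + word_score[i]
                if cand > best[m][i]:
                    best[m][i] = cand
                    back[m][i] = j

    # Choose the best final word, then trace the path back to the first word.
    last = target_len - 1
    i = max(range(last, n), key=lambda k: best[last][k])
    picked = []
    m = last
    while i != -1:
        picked.append(i)
        i = back[m][i]
        m -= 1
    picked.reverse()
    return [words[k] for k in picked]


# Toy example with made-up scores; concatenation scores are all zero here.
words = ["the", "president", "uh", "visited", "tokyo", "today"]
scores = [0.1, 1.0, -1.0, 0.8, 1.0, 0.3]
summary = summarize(words, scores, lambda j, i: 0.0, target_len=3)
print(summary)
```

A target compression ratio can be mapped to `target_len`, for example by rounding the ratio times the number of words in the transcription. This sketch runs in O(M·N²) time for N input words and M output words.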