Investigation of Deep Neural Network and Cross-adaptation for Voice Activity Detection in Meeting Speech

中谷彰宏; WANG Longbiao; 甲斐充彦

J-GLOBAL ID: 201502213228722246  Reference number: 15A0263030

Investigation of Deep Neural Network and Cross-adaptation for Voice Activity Detection in Meeting Speech

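Authors (3): 中谷彰宏, WANG Longbiao, 甲斐充彦
Source title:
Volume: 114  Issue: 365 (SP2014 106-126)  Pages: 19-24  Publication date: December 8, 2014
JST material number: S0532B  ISSN: 0913-5685  Material type: Proceedings (C)
Article type: Original paper  Country of publication: Japan (JPN)  Language: Japanese (JA)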

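In voice activity detection (VAD), noise and reverberation degrade detection performance substantially, so VAD systems that are robust in such environments are needed. This study aims to improve a VAD method based on a Deep Neural Network (DNN) and proposes environment adaptation of the VAD model. Unsupervised adaptation, which uses automatic recognition results on the unknown target data, has been studied as a DNN adaptation method. However, unsupervised adaptation generally trains on teacher signals that contain errors, so the higher the DNN's discriminative performance, the more faithfully it reproduces those errors. For this reason, DNN cross-adaptation has been proposed, which reduces the influence of errors by using multiple classification systems with different error tendencies. This study shows that using the recognition results of a GMM and an SVM, whose error tendencies differ from those of the DNN, as teacher labels for adaptation improves adaptation performance and yields VAD that is robust to noise and reverberation. (Author abstract)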
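The contrast the abstract draws between self-adaptation and cross-adaptation can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' setup: a 1-D logistic classifier stands in for the DNN, a two-cluster k-means detector (`aux_labels`, a hypothetical name) stands in for the GMM and SVM systems, and the frames are synthetic log-energy values in which the target condition is shifted by a raised noise floor.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(xs, ys, w=0.0, b=0.0, lr=0.1, epochs=500):
    """Full-batch gradient descent for a 1-D logistic classifier (DNN stand-in)."""
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def predict(model, xs):
    w, b = model
    return [1 if w * x + b > 0 else 0 for x in xs]

def aux_labels(xs, iters=10):
    """Hypothetical auxiliary detector: two-cluster 1-D k-means on the target frames."""
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        low = [x for x in xs if abs(x - lo) <= abs(x - hi)]
        high = [x for x in xs if abs(x - lo) > abs(x - hi)]
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    # Frames closer to the higher-energy cluster are labelled speech (1).
    return [1 if abs(x - hi) < abs(x - lo) else 0 for x in xs]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def frames(centre):
    # Synthetic log-energy values spread evenly within +/-0.5 of the centre.
    return [centre + 0.1 * (i % 11 - 5) for i in range(110)]

# Source (clean) condition: non-speech near 0, speech near 4.
src_x = frames(0.0) + frames(4.0)
src_y = [0] * 110 + [1] * 110
# Target (noisy) condition: a raised noise floor shifts both classes by +3.
tgt_x = frames(3.0) + frames(7.0)
tgt_y = [0] * 110 + [1] * 110  # ground truth, used for scoring only

base = train(src_x, src_y)
# Self-adaptation: fine-tune on the model's own (erroneous) target labels.
self_adapted = train(tgt_x, predict(base, tgt_x), *base)
# Cross-adaptation: fine-tune on labels from the auxiliary detector instead.
cross_adapted = train(tgt_x, aux_labels(tgt_x), *base)

acc_base = accuracy(predict(base, tgt_x), tgt_y)
acc_self = accuracy(predict(self_adapted, tgt_x), tgt_y)
acc_cross = accuracy(predict(cross_adapted, tgt_x), tgt_y)
print(acc_base, acc_self, acc_cross)
```

In this toy setting the self-adapted model is trained on its own mistakes and so keeps them, while the cross-adapted model inherits the auxiliary detector's differently distributed decisions. The sketch says nothing about the magnitude of the gains reported in the paper.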

Pattern recognition, Artificial intelligence

