Decision Fusion using Boosting Method for Multi-Modal Voice Activity Detection

竹内伸一; 羽柴隆志; 田村哲嗣; 速水悟

J-GLOBAL ID: 201002271970657076  Reference number: 10A0772044

Original title: ブースティングによるマルチモーダル音声区間検出の結果統合 (English title: Decision Fusion using Boosting Method for Multi-Modal Voice Activity Detection)

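Authors (4): 竹内伸一, 羽柴隆志, 田村哲嗣, 速水悟
Source title:
Volume: 110  Issue: 81 (SP2010 22-34)  Pages: 25-30  Publication date: June 10, 2010
JST material number: S0532B  ISSN: 0913-5685  Material type: Proceedings (C)
Article type: Original paper  Country of publication: Japan (JPN)  Language: Japanese (JA)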

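Voice activity detection (VAD), used as a front end to speech recognition, must reliably reject noise-only segments. Multi-modal VAD is one approach to improving noise robustness: it uses image information, which is unaffected by acoustic noise, and can therefore be expected to improve accuracy. This report examines decision fusion for multi-modal VAD, in which the results obtained from each modality are integrated by boosting. AdaBoost is a machine learning method that builds a strong classifier by combining multiple weak classifiers. Two-class classification is performed taking into account the weight that training assigns to each learner. In the proposed method, the audio and visual features are each trained as classifiers, and the results obtained from each feature are fused by weighted majority voting. Experimental results show that, under noise-added conditions, decision fusion by majority voting with weights assigned to the visual features is effective. (Author abstract)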
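
The weighted majority vote described in the abstract can be illustrated with a minimal sketch of textbook discrete AdaBoost applied to fixed per-feature decisions. This is not the authors' implementation: the abstract does not specify the features, the weak classifiers, or the training procedure. The function names (`adaboost_fusion_weights`, `fuse`) and the toy frame decisions are hypothetical and chosen only for illustration.

```python
import math

def adaboost_fusion_weights(decisions, labels, n_rounds=None):
    """Learn one vote weight per weak classifier with discrete AdaBoost.

    decisions: list of per-classifier lists of frame decisions (+1 speech, -1 non-speech)
    labels:    list of ground-truth frame labels (+1 / -1)
    """
    n = len(labels)
    w = [1.0 / n] * n                      # uniform sample weights to start
    alphas = [0.0] * len(decisions)        # accumulated vote weight per classifier
    rounds = len(decisions) if n_rounds is None else n_rounds
    for _ in range(rounds):
        # Weighted error of every classifier under the current sample weights.
        errs = [sum(wi for wi, d, y in zip(w, dec, labels) if d != y)
                for dec in decisions]
        k = min(range(len(errs)), key=errs.__getitem__)
        if errs[k] >= 0.5:                 # no classifier beats chance: stop
            break
        err = max(errs[k], 1e-10)          # avoid division by zero for a perfect classifier
        alpha = 0.5 * math.log((1.0 - err) / err)
        alphas[k] += alpha
        # Increase the weight of misclassified frames, decrease the rest, renormalize.
        w = [wi * math.exp(-alpha * y * d)
             for wi, d, y in zip(w, decisions[k], labels)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas

def fuse(frame_decisions, alphas):
    """Weighted majority vote over the per-classifier decisions for one frame."""
    score = sum(a * d for a, d in zip(alphas, frame_decisions))
    return 1 if score >= 0 else -1

# Toy data (made up): the audio decisions are corrupted by noise, the visual ones less so.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
audio  = [1, -1, 1, -1, -1, 1, -1, -1]
visual = [1, 1, 1, -1, -1, -1, -1, -1]
alphas = adaboost_fusion_weights([audio, visual], labels)
fused = [fuse([a, v], alphas) for a, v in zip(audio, visual)]
```

In this toy setting the visual classifier makes fewer errors on the training frames, so it receives the larger vote weight and dominates the fused decision whenever the two modalities disagree.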

Pattern recognition

Cited references (9):

FUJIMOTO, M. Study of integration of statistical model-based voice activity detection and noise suppression. Proceedings of Interspeech, 2008. 2008, 2008-2011
ASANO, F. Detection and separation of speech event using audio and video information fusion and its application to robust speech interface. EURASIP Journal on Applied Signal Processing. 2004, 2004, 1727-1738
BUTKO, T. Fusion of audio and video modalities for detection of acoustic events. Proceedings of Interspeech, 2008. 2008, 123-126
ALMAJAI, I. Using audio-visual features for robust voice activity detection in clean and noisy speech. Proceedings of EUSIPCO2008. 2008, 123-126
TAKEUCHI, S. Voice activity detection based on fusion of audio and visual information. Proceedings of AVSP, 2009-09. 2009, 151-154
