Acoustic Event Detection in Speech Overlapping Scenarios Based on High-Resolution Spectral Input and Deep Learning

ESPI Miquel; FUJIMOTO Masakiyo; NAKATANI Tomohiro

J-GLOBAL ID：201502212365671637 Reference number：15A1163768

高解像度スペクトル入力とディープラーニングに基づいた発話重複シナリオにおける聴覚イベント検出

Volume： E98.D Issue： 10 Page： 1799-1807 (J-STAGE) Publication year： 2015
JST Material Number： U0469A ISSN： 1745-1361 Document type： Article
Article type： Original paper Country of issue： Japan (JPN) Language： ENGLISH (EN)

Pattern recognition, Artificial intelligence

