Audio content retrieval based on audio samples and text auxiliary information

竹内大起; 大石康智; 仁泉大輔; 原田登; 柏野邦夫

J-GLOBAL ID: 202202220894933789 Reference number: 22A1077600

Original title (Japanese): 音響サンプルとテキスト補助情報に基づく音響コンテンツ検索

English title: Audio content retrieval based on audio samples and text auxiliary information

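Volume: 2022 Issue: Spring Pages: ROMBUNNO.1-1-5 Publication date: February 23, 2022
JST material number: G0381C ISSN: 1880-7658 Document type: Proceedings (C)
Article type: Short communication Country of publication: Japan (JPN) Language: Japanese (JA)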

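・The volume of audio signal data available online is growing exponentially day by day, so methods for efficiently retrieving the desired data are needed.
・This study considers a new task, retrieving audio signals from an audio sample combined with auxiliary text, and proposes a method to accomplish it.
・In experiments, the proposed method was compared with a baseline that does not use the auxiliary text, confirming that the proposed method achieves retrieval that makes use of the information in the auxiliary text.
・Future work will address recognizing increases and decreases in background sounds with higher accuracy, as well as collecting data in which the differences are described in text by hand and studying methods adapted to such data.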
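
The record gives no implementation details, so the following is only a minimal sketch of the task setup summarized above, not the authors' method. It assumes, hypothetically, that the audio sample, the auxiliary text, and every candidate clip have already been mapped into one shared embedding space by some pair of trained encoders. The function names, the `text_weight` parameter, and the toy vectors are all invented for illustration. Setting `text_weight=0.0` mimics a baseline that ignores the auxiliary text.

```python
import math


def normalize(v):
    """Scale a vector to unit length (the zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(audio_query, text_query, candidates, text_weight=1.0):
    """Rank candidate ids by similarity to a fused audio + text query.

    Assumption: the fusion is a plain weighted sum of unit vectors.
    text_weight=0.0 ignores the auxiliary text (audio-only baseline).
    """
    a = normalize(audio_query)
    t = normalize(text_query)
    fused = [x + text_weight * y for x, y in zip(a, t)]
    scored = [(cosine(fused, emb), cid) for cid, emb in candidates.items()]
    # Highest similarity first.
    return [cid for _, cid in sorted(scored, reverse=True)]


# Toy, hand-made 3-d embeddings (hypothetical; real ones would come from encoders).
candidates = {
    "rain_only": [1.0, 0.0, 0.0],
    "rain_with_birds": [0.7, 0.7, 0.0],
    "traffic": [0.0, 0.0, 1.0],
}
audio_query = [1.0, 0.1, 0.0]  # audio sample that sounds like rain
text_query = [0.0, 1.0, 0.0]   # auxiliary text, e.g. "add birdsong"

baseline = rank_candidates(audio_query, text_query, candidates, text_weight=0.0)
with_text = rank_candidates(audio_query, text_query, candidates, text_weight=1.0)
print("audio only:", baseline)
print("audio + text:", with_text)
```

In this toy setup the audio-only baseline ranks `rain_only` first, while adding the text query moves `rain_with_birds` to the top, which is the kind of difference the comparison against a text-free baseline is meant to expose.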

Acoustic signal processing, Pattern recognition

References (7):

K. Drossos, S. Adavanne, and T. Virtanen, “Clotho: An audio captioning dataset,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2019, pp. 736-740.
A.-M. Oncescu, A. S. Koepke, J. Henriques, Z. Akata, and S. Albanie, “Audio retrieval with natural language queries,” in Proc. Interspeech, 2021.
A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, “Learning transferable visual models from natural language supervision,” arXiv preprint arXiv:2103.00020, 2021.
S. Hershey, S. Chaudhuri, D. P. W. Ellis, J. F. Gemmeke, A. Jansen, R. C. Moore, M. Plakal, D. Platt, R. A. Saurous, B. Seybold, M. Slaney, R. J. Weiss, and K. Wilson, “CNN architectures for large-scale audio classification,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2017.
V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” arXiv preprint arXiv:1910.01108, 2019.
