LI Sheng

リシェン | LI Sheng

Affiliation and department：
Job title： Researcher
Homepage URL (2)： https://ast-astrec.nict.go.jp/member/sheng-li/index-modern-jp.html , https://ast-astrec.nict.go.jp/member/sheng-li/index.html

Research field (1)： Perceptual information processing

Research keywords (5)： large language model (speech, text) , security-aware speech processing , multimodal speech processing , computer assisted language learning , Speech Recognition/translation

Research theme for competitive and other funds (10)：

2023 - 2028 Creation of fundamental technologies for spoken dialogue translation that accurately conveys intent
2023 - 2026 M3OLR: Towards Effective Multilingual, Multimodal and Multitask Oriental Low-resourced Language Speech Recognition
2023 - 2024 Spoof Detection for Automatic Speaker Verification
2024 - enhancing large language model
2022 - 2024 Bridging Eurasia from Sea -- Multilingual Speech Recognition for Maritime Silkroad

Papers (113)：

Hay Mar Soe Naing, Win Pa Pa, Sheng Li. Parallel and Limited Data Voice Conversions on Myanmar Language Speech for Spoofed Detection. ACM Multimedia Asia 2024 Workshops. 2024
Qingqing Zhang, Lei Luo, Simin Xu, Yongjing Chen, Chuang Li, Sheng Li, Ruili Wang. LaMuCo: Large-Scale Multilingual Conversation Speech Recognition Challenge. ACM Multimedia Asia 2024 workshop. 2024
Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz. Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. ACM Multimedia Asia 2024. 2024
Chin-Yuen Kwok, Sheng Li, Jia-Qi Yip, Eng-Siong Chng. Low-resource Language Adaptation with Ensemble of PEFT Approaches. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2024. 2024
Sheng Li, Yuka Ko, Akinori Ito. LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2024. 2024

MISC (18)：

Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara. Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction. arXiv. 2024
Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa. Extracting Spatiotemporal Data from Gradients with Large Language Models. arXiv. 2024
Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz. Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. arXiv. 2024
Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa. Enhancing Privacy of Spatiotemporal Federated Learning against Gradient Inversion Attacks. arXiv. 2024
Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara. MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. 2024

Patents (7)：

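Inference device and training method for inference device
Inference device, inference program, and training method
Method and apparatus for training a language identification model, and computer program therefor
Classifier, trained model, and training method
Speech recognition system, speech recognition method, and trained model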

Books (4)：

Voices of the Himalayas: Investigation of Speech Recognition Technology for the Tibetan Language
2023 ISBN:9784904020289
Bridging Eurasia: Multilingual Speech Recognition for Silkroad
2023 ISBN:9784904020296
Phantom in the Opera: The Vulnerabilities of Speech-based Artificial Intelligence Systems
2022 ISBN:9784904020265
Automatic speech recognition: Speech-to-Speech Translation
Springer Singapore 2020

Lectures and oral presentations (58)：

Improving speech recognition systems by integrating large language models
(NICT Open House 2024 2024)
Combining Large Language Model with Speech Recognition System in Low-resource Settings
(NLP2024 2024)
Investigating effective methods for combining large language model with speech recognition system
(Acoustical Society of Japan 151st (Spring 2024) meeting 2024)
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
(ICT-innovation 2023 (Kyoto Univ.) 2024)
Self-Supervised Learning MOS Prediction with Listener Enhancement
(VoiceMOS mini workshop 2023)

Works (8)：

HSoftmax: Hierarchical Softmax (https://github.com/Derek-Gong/hsoftmax/)
Zhuo Gong, Qianying Liu, Sheng Li, Zhengdong Yang, Yuhang Yang 2020 -
Julius for speech foundation models
very deep residual time-delay neural network (TDNN) with LFMMI objective implemented with MS-CNTK
online speech recognition module for Erica the human robot
Julius decoder with EESEN CTC acoustic model

Education (3)：

2012 - 2016 Kyoto University Graduate School Ph.D Informatic Science
2007 - 2009 Nanjing University Joint Program of Chinese Academy of Sciences, Chinese University of Hong Kong and Nanjing University M.E
2002 - 2006 Nanjing University B.S Computer Science

Professional career (1)：

Ph.D Informatic Science (Kyoto University)

Work history (8)：

2020 - present National Institute of Information and Communications Technology (NICT) Advanced Speech Technology Laboratory (ASTL) Tenure-track Researcher
2024/02 - 2024/03 Nanyang Technological University visiting researcher
2021/12 - 2023/03 Kyoto University master course advisor
2019/04 - 2019/05 Oxford University Computer science department visiting researcher
2017 - 2019 National Institute of Information and Communications Technology (NICT) Advanced Speech Technology Laboratory (ASTL) researcher (hired by Tokyo Olympic2020 project)

Committee career (13)：

- 2026 APSIPA Speech, Language, and Audio (SLA) Technical Committee (till 2026)
2024/06 - 2024/12 Publicity Chair of ACM Multimedia Asia 2024
2024/12 - Co-organizing ACM Multimedia Asia 2024 workshop: Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental) Co-organizer
2024/07 - Session Chair of DASFAA2024
2024/04 - 2024/04 Session Chair of IEEE-ICASSP2024

Awards (22)：

2023/12 - ICASSP2024 ICMC-ASR (In-Car Multi-Channel Automatic Speech Recognition) Challenge top2 in one track
2023/12 - 1st place in one track in ASRU2023 special session: VoiceMOS challenge
2023/05 - IEEE signal processing society IEEE-SPS grant for IEEE-ICASSP2023 oral presentation (Co-supervised PhD student Qianying Liu)
2022 - 1st place in 6 indexes (total 16) of Main/OOD tracks in INTERSPEECH2022 special session: VoiceMOS challenge
2021/12 - Oriental language recognition challenge 2021 3rd/4th place in constrained/unconstrained resource multilingual ASR tracks of OLR2021 challenge

Association Membership(s) (7)：

APNNS (Asia Pacific Neural Network Society) , APSIPA (Asia Pacific Signal and Information Processing Association) , SIG-CSLP (Chinese Spoken Language Processing) , ASJ (Acoustical Society of Japan) , ISCA (International Speech Communication Association) , IEEE/IEEE-SPS , ACM (Association for Computing Machinery)
