J-GLOBAL ID:201801014603986279
Update date: Oct. 24, 2024
LI Sheng
リ シェン | LI Sheng
Affiliation and department:
Job title:
Researcher
Homepage URL (2):
https://ast-astrec.nict.go.jp/member/sheng-li/index-modern-jp.html
,
https://ast-astrec.nict.go.jp/member/sheng-li/index.html
Research field (1):
Perceptual information processing
Research keywords (5):
large language model (speech, text)
, security-aware speech processing
, multimodal speech processing
, computer assisted language learning
, speech recognition/translation
Research theme for competitive and other funds (10):
- 2023 - 2028 Creating Fundamental Technologies for Spoken Dialogue Translation That Accurately Conveys Intent
- 2023 - 2026 M3OLR: Towards Effective Multilingual, Multimodal and Multitask Oriental Low-resourced Language Speech Recognition
- 2023 - 2024 Spoof Detection for Automatic Speaker Verification
- 2024 - enhancing large language model
- 2022 - 2024 Bridging Eurasia from Sea -- Multilingual Speech Recognition for Maritime Silkroad
- 2021 - 2023 Phantom in the Opera -- the Vulnerabilities of Speech Interface for Robotic Dialogue System
- 2020 - 2022 Advanced Multilingual End-to-End Speech Recognition
- 2020 - 2022 Bridging Eurasia -- Multilingual Speech Recognition for Silkroad
- 2020 - 2021 Speaker De-identification with Provable Privacy in Speech Data Release
- 2019 - 2021 Next generation multilingual End-to-End speech recognition (from G30 to G200)
Papers (113):
- Hay Mar Soe Naing, Win Pa Pa, Sheng Li. Parallel and Limited Data Voice Conversions on Myanmar Language Speech for Spoofed Detection. ACM Multimedia Asia 2024 Workshops. 2024
- Qingqing Zhang, Lei Luo, Simin Xu, Yongjing Chen, Chuang Li, Sheng Li, Ruili Wang. LaMuCo: Large-Scale Multilingual Conversation Speech Recognition Challenge. ACM Multimedia Asia 2024 workshop. 2024
- Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz. Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. ACM Multimedia Asia 2024. 2024
- Chin-Yuen Kwok, Sheng Li, Jia-Qi Yip, Eng-Siong Chng. Low-resource Language Adaptation with Ensemble of PEFT Approaches. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2024. 2024
- Sheng Li, Yuka Ko, Akinori Ito. LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2024. 2024
MISC (18):
- Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara. Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction. arXiv. 2024
- Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa. Extracting Spatiotemporal Data from Gradients with Large Language Models. arXiv. 2024
- Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz. Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. arXiv. 2024
- Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa. Enhancing Privacy of Spatiotemporal Federated Learning against Gradient Inversion Attacks. arXiv. 2024
- Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara. MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. 2024
Patents (7):
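- Inference device and training method for the inference device
- Inference device, inference program, and training method
- Method and apparatus for training a language identification model, and computer program therefor
- Classifier, trained model, and training method
- Speech recognition system, speech recognition method, and trained model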
Books (4):
- Voices of the Himalayas: Investigation of Speech Recognition Technology for the Tibetan Language
  2023 ISBN:9784904020289
- Bridging Eurasia: Multilingual Speech Recognition for Silkroad
  2023 ISBN:9784904020296
- Phantom in the Opera: The Vulnerabilities of Speech-based Artificial Intelligence Systems
  2022 ISBN:9784904020265
- Automatic speech recognition: Speech-to-Speech Translation
  Springer Singapore 2020
Lectures and oral presentations (58):
- Improving Speech Recognition Systems by Integrating Large Language Models
  (NICT Open House 2024 2024)
- Combining Large Language Model with Speech Recognition System in Low-resource Settings
  (NLP2024 2024)
- Investigating effective methods for combining large language model with speech recognition system
  (Acoustical Society of Japan 151st (Spring 2024) meeting 2024)
- Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
  (ICT-innovation 2023 (Kyoto Univ.) 2024)
- Self-Supervised Learning MOS Prediction with Listener Enhancement
  (VoiceMOS mini workshop 2023)
Works (8):
- HSoftmax: Hierarchical Softmax (https://github.com/Derek-Gong/hsoftmax/)
  Zhuo Gong, Qianying Liu, Sheng Li, Zhengdong Yang, Yuhang Yang 2020 -
- Julius for speech foundation models
- very deep residual time-delay neural network (TDNN) with LFMMI objective implemented with MS-CNTK
- online speech recognition module for Erica the human robot
- Julius decoder with EESEN CTC acoustic model
Education (3):
- 2012 - 2016 Kyoto University Graduate School Ph.D Informatic Science
- 2007 - 2009 Nanjing University Joint Program of Chinese Academy of Sciences, Chinese University of Hong Kong and Nanjing University M.E
- 2002 - 2006 Nanjing University B.S Computer Science
Professional career (1):
- Ph.D Informatic Science (Kyoto University)
Work history (8):
- 2020 - present National Institute of Information and Communications Technology (NICT) Advanced Speech Technology Laboratory (ASTL) Tenure-track Researcher
- 2024/02 - 2024/03 Nanyang Technological University visiting researcher
- 2021/12 - 2023/03 Kyoto University master course advisor
- 2019/04 - 2019/05 Oxford University Computer science department visiting researcher
- 2017 - 2019 National Institute of Information and Communications Technology (NICT) Advanced Speech Technology Laboratory (ASTL) researcher (hired by Tokyo Olympic2020 project)
- 2016/04 - 2016/12 Kyoto University, Speech and Audio Processing Lab. researcher (hired by Erica Humanoid robot project)
- 2012/04 - 2012/09 Sogou/Sohu Pinyin IME [Beijing, China] researcher (working on speech input)
- 2009/07 - 2012/04 Shenzhen Institute of Advanced Technology [Shenzhen, Guangdong China] researcher (computer-assisted language learning)
Committee career (13):
- - 2026 APSIPA Speech, Language, and Audio (SLA) Technical Committee (till 2026)
- 2024/06 - 2024/12 Publicity Chair of ACM Multimedia Asia 2024
- 2024/12 - Co-organizing ACM Multimedia Asia 2024 workshop: Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental) Co-organizer
- 2024/07 - Session Chair of DASFAA2024
- 2024/04 - 2024/04 Session Chair of IEEE-ICASSP2024
- 2023/12 - Co-organizing ACM Multimedia Asia 2023 workshop: Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental)
- 2023/09 - Session Chair of ICANN 2023
- 2023/07 - Area Chair of EMNLP 2023
- 2023/07 - Area Chair of APSIPA ASC 2023
- 2022/10 - Co-organizing COLING 2022 workshop: When Creative AI Meets Conversational AI (CAI + CAI = CAI^2)
- 2022/06 - 2022/06 Session Chair for Speaker Odyssey2022 (Evaluation and Benchmarking Session)
- 2020/10 - 2020/10 Session Chair for INTERSPEECH2020 (Topics of ASR I)
- 2020/10 - 2020/10 Co-organizing INTERSPEECH2020 SLIMTS (Spoken Language Interaction for Mobile Transportation System) workshop
Awards (22):
- 2023/12 - ICASSP2024 ICMC-ASR (In-Car Multi-Channel Automatic Speech Recognition) Challenge top2 in one track
- 2023/12 - 1st place in one track in ASRU2023 special session: VoiceMOS challenge
- 2023/05 - IEEE signal processing society IEEE-SPS grant for IEEE-ICASSP2023 oral presentation (Co-supervised PhD student Qianying Liu)
- 2022 - 1st place in 6 indexes (total 16) of Main/OOD tracks in INTERSPEECH2022 special session: VoiceMOS challenge
- 2021/12 - Oriental language recognition challenge 2021 3rd/4th place in constrained/unconstrained resource multilingual ASR tracks of OLR2021 challenge
- 2021/11 - O-COCOSDA2021 Supervised student (Soky Kak) got best student paper nomination
- 2021/06 - National Institute of Information and Communications Technology (NICT) Outstanding Performance Award Excellence Award (Group)
- 2020/09 - ISCA Travel Grant Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription
- 2020/09 - ISCA Travel Grant Singing Voice Extraction with Attention based Spectrograms Fusion
- 2020/07 - ICME 2020 best student paper nomination, selected as journal paper in IEEE Trans Multimedia (TMM)
- 2020/05 - National Institute of Information and Communications Technology (NICT) FY 2020 International Development Fund (new proposal score top1)
- 2019 - National Institute of Information and Communications Technology selected as tenure-track researcher with grants (only 3 persons in FY2019)
- 2018 - IEEE Signal Processing Society Japan Student Journal Paper Award
- 2016/03 - Kyoto Univ. 2012-2016 admission/tuition fee total exemption
- 2016 - Paper nominated as ACM/IEEE Trans. Audio, Speech & Language Process. cover
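- 2012 - IBM travel grant for the Interspeech conference in Portland
- 2012 - Admission to Kyoto University as a university-recommended Japanese government scholarship international student (special allocation)
- 2011 - Creative Planning Award of the Hong Kong Young Entrepreneurs Program
- 2011 - Chinese Academy of Sciences Outstanding Staff Award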
- 2004 - Nanjing University Encouragement Scholarship
- 2002 - Chen Yinchuan Scholarship (Hongkong) for Excellent University New Students
- 2002 - Jiangsu Province, China: second prize in the Chemistry Olympiad, third prize in the Biology Olympiad
Association Membership(s) (7):
APNNS (Asia Pacific Neural Network Society)
, APSIPA (Asia Pacific Signal and Information Processing Association)
, SIG-CSLP (Chinese Spoken Language Processing)
, ASJ (Acoustical Society of Japan)
, ISCA (International Speech Communication Association)
, IEEE/IEEE-SPS
, ACM (Association for Computing Machinery)