Bio

I am a first-year Ph.D. student at the Language Technologies Institute, Carnegie Mellon University, advised by Prof. Shinji Watanabe. My research focuses mainly on speech and language, and I have recently become interested in developing spoken language models. Previously, I was a research assistant at the Speech Processing Lab, National Taiwan University. I was also an R&D engineer at MediaTek Inc., working on computer vision tasks such as super-resolution and frame-rate conversion (MEMC), where I designed and trained lightweight networks that run on mobile devices in real time. I received my M.S. degree from National Taiwan University in 2021. During that time, I was a member of the Speech Processing Laboratory led by Prof. Lin-shan Lee and Prof. Hung-yi Lee.

Publications

  • Bagpiper: Solving Open-Ended Audio Tasks via Rich Captions

    Jinchuan Tian, Haoran Wang, Bo-Hao Su, Chien-yu Huang (co-first), Qingzheng Wang, Jiatong Shi, William Chen, Xun Gong, Siddhant Arora, Chin-Jou Li, Masao Someki, Takashi Maekaku, Yusuke Shinohara, Jin Sakuma, Chao-Han Huck Yang, Shinji Watanabe
    Preprint 2026
  • DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

    Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Sung-Feng Huang, Chih-Kai Yang, Chee-En Yu, Chun-Wei Chen, Wei-Chih Chen, Chien-yu Huang, Yi-Cheng Lin, Yu-Xiang Lin, Chi-An Fu, Chun-Yi Kuan, Wenze Ren, Xuanjun Chen, Wei-Ping Huang, En-Pei Hu, Tzu-Quan Lin, Yuan-Kuei Wu, Kuan-Po Huang, Hsiao-Ying Huang, Huang-Cheng Chou, Kai-Wei Chang, Cheng-Han Chiang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee
    Preprint 2025
  • Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

    Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, et al.
    The Thirteenth International Conference on Learning Representations 2025
  • SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning

    Chien-yu Huang, Min-Han Shih, Ke-Han Lu, Chi-Yuan Hsiao, Hung-yi Lee
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025
  • A Preliminary Exploration with GPT-4o Voice Mode

    Yu-Xiang Lin, Chih-Kai Yang, Wei-Chih Chen, Chen-An Li, Chien-yu Huang, Xuanjun Chen, Hung-yi Lee
    Preprint 2025
  • Fusion Of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition

    Shih-Heng Wang, Jiatong Shi, Chien-yu Huang, Shinji Watanabe, Hung-yi Lee
    IEEE Spoken Language Technology Workshop 2024
  • Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

    Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
  • Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model

    Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Huang, Chien-yu Huang, Shang-Wen Li, Hung-yi Lee
    IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2023
  • Toward Degradation-Robust Voice Conversion

    Chien-yu Huang, Kai-Wei Chang, Hung-yi Lee
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
  • Utilizing Self-supervised Representations for MOS Prediction

    Wei-Cheng Tseng (co-first), Chien-yu Huang (co-first), Wei-Tsung Kao, Yist Y. Lin, Hung-yi Lee
    Interspeech 2021
  • Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech

    Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-chun Hsu, Hung-yi Lee
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
  • How Far Are We from Robust Voice Conversion: A Survey

    Tzu-hsien Huang, Jheng-hao Lin, Chien-yu Huang, Hung-yi Lee
    IEEE Spoken Language Technology Workshop 2021
  • Defending Your Voice: Adversarial Attack on Voice Conversion

    Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-shan Lee
    IEEE Spoken Language Technology Workshop 2021

Honors

  • 2nd Place, M2VoC Challenge
  • Advanced Speech Technologies Scholarship
  • Excellence Achievement in AI CUP Competition
  • Dean’s List (Twice)