About me

I am a student in the Master’s in Language Technologies program at Carnegie Mellon University’s Language Technologies Institute, where I am advised by Professor Shinji Watanabe as part of the Audio and Voice Lab. My work at CMU has focused on building large-scale speech foundation models. My research interests include multilingual speech recognition, speech translation, and long-form speech processing.

Previously, I was a Software Engineer at Texas Instruments on the Ti.com e-commerce team. This collection of articles details some of the work I did there.

I received my BS in Computer Science and BA in History from the University of Central Florida in May 2021.

Recent News

  • I will be attending ASRU 2023 in Taiwan this December
  • Check out my blog post on speech foundation models

Past Positions

From May 2023 to August 2023, I was a Research Intern at the NTT Communication Sciences Lab in Japan, supervised by Marc Delcroix, Atsunori Ogawa, and Takatomo Kano.

From April 2021 to July 2021, I was part of the UCF Security and Analytics Lab, supervised by Professor David Mohaisen.

From June 2020 to July 2022, I was a Research Assistant at the UCF Computational Biology Lab, supervised by Professor Wei Zhang.

From January 2020 to October 2021, I was part of the UCF Evolutionary Computation Lab, supervised by Professor Annie Wu.

Selected Publications

My Google Scholar profile is more up to date.

Speech Foundation Models

Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning
William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe
ASRU 2023

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe
ASRU 2023

YODAS: Youtube-Oriented Dataset for Audio and Speech
Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe
ASRU 2023

Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
William Chen, Xuankai Chang, Yifan Peng, Zhaoheng Ni, Soumi Maiti, Shinji Watanabe

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe

Multilingual Speech Recognition

Improving Massively Multilingual ASR With Auxiliary CTC Objectives
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe

Speech Translation

CMU’s IWSLT 2023 Simultaneous Speech Translation System
Brian Yan*, Jiatong Shi*, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe
IWSLT 2023

QUESPA Submission for the IWSLT 2023 Dialect and Low-resource Speech Translation Tasks
John E. Ortega, Rodolfo Zevallos, William Chen
IWSLT 2023

Long-Form Speech Processing

Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing
William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe
To appear in ICASSP 2024

AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language Model
Jee-weon Jung, Roshan Sharma, William Chen, Bhiksha Raj, Shinji Watanabe
To appear in ICASSP 2024

ESPNet-SUMM: Introducing a novel large dataset, toolkit, and a cross-corpora evaluation of speech summarization systems
Roshan Sharma, William Chen, Takatomo Kano, Ruchira Sharma, Atsunori Ogawa, Siddhant Arora, Marc Delcroix, Rita Singh, Shinji Watanabe, Bhiksha Raj
ASRU 2023

Summarize while Translating: Universal Model with Parallel Decoding for Summarization and Translation
Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Kohei Matsuura, Takanori Ashihara, William Chen, Shinji Watanabe
ASRU 2023