Dong Li

I am a junior researcher at Shanghai AI Lab, where I work on multimodal machine learning and large language models.

Research

I'm interested in multimodal machine learning (MML), large language models (LLMs), efficient Transformers, and optimization. Much of my research is about MML and LLMs. Representative papers are highlighted.

Fine-grained Audible Video Description
Xuyang Shen, Dong Li, Jinxing Zhou, Zhen Qin, Bowen He, Xiaodong Han, Aixuan Li, Yuchao Dai, Lingpeng Kong, Meng Wang, Yu Qiao, Yiran Zhong
CVPR*, 2023
project page / video / arXiv
We explore a new task for audio-visual-language modeling called fine-grained audible video description (FAVD). It aims to provide detailed textual descriptions of audible videos, including the appearance and spatial location of each object, the actions of moving objects, and the sounds in the video.

Toeplitz Neural Network for Sequence Modeling
Zhen Qin, Xiaodong Han, Weixuan Sun, Bowen He, Dong Li, Dongxu Li, Yuchao Dai, Lingpeng Kong, Yiran Zhong
ICLR, 2023 (notable-top-25%)
arXiv / code
We propose to model sequences with a relative-position-encoded Toeplitz matrix and use a Toeplitz matrix-vector product trick to reduce the space-time complexity of sequence modeling to log-linear.

Neural Architecture Search on Efficient Transformers and Beyond
Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong
arXiv*, 2022
arXiv
We propose a new framework that uses neural architecture search (NAS) to find optimal architectures for efficient Transformers.

Linear Video Transformer with Feature Fixation
Kaiyue Lu, Zexiang Liu, Jianyuan Wang, Weixuan Sun, Zhen Qin, Dong Li, Xuyang Shen, Hui Deng, Xiaodong Han, Yuchao Dai, Yiran Zhong
arXiv, 2022
arXiv
We propose a feature fixation module that reweights the feature importance of the query and key before computing linear attention.
