Boyi Li

Email: boyilics [at] gmail [dot] com

Brief Bio

I am a Research Scientist at NVIDIA Research and a Postdoctoral Scholar at UC Berkeley, advised by Prof. Jitendra Malik and Prof. Trevor Darrell.

I received my Ph.D. from Cornell University, advised by Prof. Serge Belongie and Prof. Kilian Q. Weinberger.

My research aims to develop generalizable algorithms and interactive intelligent systems, with a focus on reasoning, large language models, generative models, and robotics. I pursue this by aligning representations across multimodal data such as 2D pixels, 3D geometry, language, audio, touch, and smell.

Selected Publications

Boyi Li, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik

Synthesizing Moving People with 3D Control

arXiv, 2024

Paper · Project Webpage · Code

Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone

LLaDA: Driving Everywhere with Large Language Model Policy Adaptation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Paper · Project Webpage · Video · Featured in NVIDIA GTC · NVIDIA Official Video · Bilibili

Tsung-Han Wu*, Long Lian*, Joseph E. Gonzalez, Boyi Li†, Trevor Darrell†

Self-correcting LLM-controlled Diffusion Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Paper · Project Webpage · Video · Code

Long Lian*, Baifeng Shi*, Adam Yala†, Trevor Darrell†, Boyi Li†

LLM-grounded Video Diffusion Models

International Conference on Learning Representations (ICLR), 2024

Paper · Project Webpage · Code

Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar

CMD: Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition

International Conference on Learning Representations (ICLR), 2024

Paper · Project Webpage · Code

Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, Yue Wang

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

International Conference on Learning Representations (ICLR), 2024

Paper · Project Webpage · Code · NVIDIA Official Video

Long Lian, Boyi Li, Adam Yala, Trevor Darrell

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Transactions on Machine Learning Research (TMLR), Featured Certification, 2024

Workshop on Knowledge and Logical Reasoning in the Era of Data-driven Learning at ICML, 2023

Paper · Project Webpage · Code · BAIR Blog · Hugging Face Demo

Boyi Li*, Rodolfo Corona*, Karttikeya Mangalam*, Catherine Chen*, Daniel Flaherty, Serge Belongie, Kilian Q. Weinberger, Jitendra Malik, Trevor Darrell, Dan Klein

Re-evaluating the Need for Multimodal Signals in Unsupervised Grammar Induction

Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL Findings), 2024

Paper

Boyi Li*, Philipp Wu*, Pieter Abbeel, Jitendra Malik

Interactive Task Planning with Language Models

Workshop on Language and Robot Learning: Language as Grounding at CoRL, 2023

Paper · Project Webpage · Code · Video

Jiaxin Ge, Sanjay Subramanian, Trevor Darrell†, Boyi Li†

From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Paper

Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie

SITTA: Single Image Texture Translation for Data Augmentation

European Conference on Computer Vision (ECCV) Workshops, 2022

Paper · Code

Boyi Li, Kilian Q. Weinberger, Serge Belongie, Vladlen Koltun, René Ranftl

Language-driven Semantic Segmentation

International Conference on Learning Representations (ICLR), 2022

Paper · Project Webpage · Code · Demo

Boyi Li, Serge Belongie, Ser-nam Lim, Abe Davis

Neural Image Recolorization for Creative Domains

5th Workshop on Computer Vision for Fashion, Art, and Design at CVPR, Oral, 2022

Paper · Project Webpage

Boyi Li*, Felix Wu*, Ser-nam Lim, Serge Belongie, Kilian Q. Weinberger

On Feature Normalization and Data Augmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Paper · Project Webpage · Code · Video

Boyi Li*, Felix Wu*, Kilian Q. Weinberger, Serge Belongie

Positional Normalization

Neural Information Processing Systems (NeurIPS), Spotlight, 2019

Paper · Project Webpage · Code · Video

Miscellaneous

Classical music (violin/piano), painting, interior design, singing, and raising cute animals.