About Me

I’m a fourth-year PhD student (2020-present) in the Department of Computer Science & Engineering at the Hong Kong University of Science and Technology, co-supervised by Prof. Heung-Yeung Shum and Prof. Lionel M. Ni. I have interned at the International Digital Economy Academy, Shenzhen (advised by Prof. Lei Zhang) and at Microsoft Research, Redmond (advised by Dr. Jianwei Yang and Dr. Chunyuan Li). Previously, I obtained my bachelor’s degree from the School of Electronic Information and Electrical Engineering at Shanghai Jiao Tong University in 2019.

📌 My research interests lie in large multimodal models, visual understanding, and visual generation.

✉️ Feel free to contact me for discussion or collaboration!

🔥 News

📝 Recent Works

Refer to my Google Scholar profile for the full list.

  • LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models.
    Hao Zhang*, Hongyang Li*, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang.
    arXiv 2023.
    [Paper][Code]

  • Semantic-SAM: Segment and Recognize Anything at Any Granularity.
    Feng Li*, Hao Zhang*, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang, Jianfeng Gao.
    arXiv 2023.
    [Paper][Code]

  • SoM: Set-of-Mark Visual Prompting for GPT-4V.
    Jianwei Yang*, Hao Zhang*, Feng Li*, Xueyan Zou*, Chunyuan Li, Jianfeng Gao.
    arXiv 2023.
    [Paper][Code]

  • OpenSeeD: A Simple Framework for Open-Vocabulary Segmentation and Detection.
    Hao Zhang*, Feng Li*, Xueyan Zou*, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang.
    ICCV 2023.
    [Paper][Code]

  • SEEM: Segment Everything Everywhere All at Once.
    Xueyan Zou*, Jianwei Yang*, Hao Zhang*, Feng Li*, Linjie Li, Jianfeng Gao, Yong Jae Lee.
    NeurIPS 2023.
    [Paper][Code]

  • Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
    Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang.
    arXiv 2023.
    [Paper][Code]

  • Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation.
    Feng Li*, Hao Zhang*, Huaizhe Xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum.
    CVPR 2023.
    [Paper][Code]

  • DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection.
    Hao Zhang*, Feng Li*, Shilong Liu*, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum.
    ICLR 2023.
    [Paper][Code] Ranked 2nd among ICLR 2023 Most Influential Papers

  • DN-DETR: Accelerate DETR Training by Introducing Query DeNoising.
    Feng Li*, Hao Zhang*, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022. Oral presentation.
    [Paper][Code]

(* denotes equal contribution.)

🎖 Selected Awards

  • RedBird Research Scholarship at HKUST, 2020
  • Hong Kong Postgraduate Scholarship, 2020, 2021, 2022, and 2023
