Exploring the long-form understanding capability of LLMs/VLMs
Project Description
While today’s large language models can breeze through paragraphs and even multi-page articles, their grasp of truly long-form text—treatises that stretch across hundreds of pages, patient records that unfold over years, or sprawling literary works threaded with recurring motifs—remains murky. This project asks: what does “long-form language understanding” really mean, and how far do current LLMs actually reach? We will probe state-of-the-art models’ working memory, global coherence, and narrative acuity by pushing them through entire novels, full-length legislative bills, and unsegmented chat logs that span months. Our aim is not to win another leaderboard but to map the conceptual terrain—spotting where today’s architectures glide, where they stumble, and which inductive biases, memory mechanisms, or prompting strategies might push them further. The investigation is exploratory and research-heavy; it calls for a fascination with textual cognition, strong engineering chops, and the patience to untangle messy, open-ended outcomes.
Supervisor
SONG Yangqiu
Quota
5
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP3200
UROP4100
Applicant's Roles
Work together with a PhD student on formulating tasks, designing experiments, analyzing results, and writing research papers.
Applicant's Learning Objectives
Gain hands-on experience working with LLMs/VLMs, and learn how to conduct research with them across diverse reasoning scenarios.
Complexity of the project
Challenging