Human-centric benchmarking of large language models
Project Description
This project tests large language models against human language processing. In one study, we will examine how tokenization affects sentence comprehension, including potential bias and safety issues. In another, we plan to benchmark the models' reasoning abilities.
Supervisor
HSIAO, Janet Hui-wen
Co-Supervisor
HASLETT, David Andrew
Quota
2
Course type
UROP1100
Applicant's Roles
The student will help develop test materials, test the models, and collect and analyse the data. Depending on the student's prior experience and interests, the work may also involve manipulating or fine-tuning a model and observing its behaviour.
Applicant's Learning Objectives
To gain a deeper understanding of large language models' underlying mechanisms, abilities, and limitations, especially in comparison with human language processing.
Complexity of the project
Challenging