Using Large Language Models (LLMs) for Software Development | Undergraduate Research Opportunities Program

Project Description

Successes have been reported in using Large Language Models (LLMs) for code development and maintenance. Examples of LLMs include GPT-4o, Gemini, Claude, Llama, Gemma, and other open local LLMs used with the aid of Retrieval-Augmented Generation (RAG). At the same time, there are reports that the code generated by these LLMs is error-prone. We are conducting research projects to develop new methodologies, such as neurosymbolic techniques and agents, for using popular LLMs to solve common software engineering problems. These problems include program generation, test case generation, vulnerability detection, software fault detection, software fault localization, automated program repair, code summarization, and code refactoring.

Supervisor

CHEUNG, Shing Chi

Quota

4

Course type

UROP1000

UROP1100

UROP2100

UROP3100

UROP3200

UROP4100

Applicant's Roles

Students in this project will help the research team to
- design, prepare, and conduct experiments using popular LLMs;
- analyze and validate the experimental results; and
- reproduce results reported by earlier research studies.

Applicant's Learning Objectives

Students will learn the following in this project:
- Principles of LLMs,
- Hands-on experience in using popular LLMs and RAG,
- Research methodologies,
- Experience in conducting experiments,
- Experience in analyzing experimental results, and
- Experience in writing technical reports.

Complexity of the project

Challenging