Deep learning in natural language processing
Project Description
In recent years, computer science researchers have introduced large language models (LLMs), deep-learning-based NLP algorithms such as Google's BERT that consider word context (e.g., the other words in the same text) and word sequences when summarizing texts. These researchers show that LLMs can significantly outperform simpler NLP models that assume a bag-of-words structure in tasks such as sentiment classification of general texts, language translation, and question answering. In this project, we will label the sentiments of a dataset of financial texts that mention multiple entities (companies) and use the dataset to train and fine-tune several NLP algorithms, including LLMs. One example of such an LLM is FinBERT (https://finbert.ai/), a pre-trained deep learning NLP algorithm specific to the finance domain.
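To illustrate why the bag-of-words assumption is limiting for texts that mention multiple companies, the minimal sketch below compares two sentences with opposite meanings. The sentences and the helper name `bag_of_words` are made up for illustration and are not part of the project's dataset or code; the tokenizer is a simple whitespace split, which is an assumption rather than what any particular NLP library does.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase whitespace-separated tokens, discarding word order."""
    return Counter(text.lower().split())


# Hypothetical example sentences: same words, opposite meaning for each company.
a = "Apple outperformed Samsung this quarter"
b = "Samsung outperformed Apple this quarter"

# Both sentences map to the same bag of words, so a model that only sees
# word counts cannot tell which company the positive sentiment refers to.
same_bag = bag_of_words(a) == bag_of_words(b)
print(same_bag)  # prints True
```

Because the two representations are identical, any entity-level sentiment signal has to come from word order and context, which is the information that models such as BERT are designed to use.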
Supervisor
HUANG, Allen
Co-Supervisor
YANG Yi
Quota
10
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP4100
Applicant's Roles
- Label texts for the NLP task
- Train and fine-tune deep learning models for the NLP task
- Applicants should be familiar with Python
Applicant's Learning Objectives
Students will gain experience in the lifecycle of data analysis for financial text, including data cleaning, Python programming, and natural language processing.
Complexity of the project
Challenging