Developing a deep-learning-enabled nucleic acid structure generator
Project Description
Sophisticated experimental approaches have been developed to determine the structural details of biomolecules. The structures of about 170,000 proteins have been solved over the last 60 years, yet more than 200 million proteins are known across all life forms. AI offers an unprecedented capacity to fill this gap. For instance, the newly developed deep-learning-enabled structure generator outperforms its all-atom and homology modelling counterparts. This project aims to further enhance the accuracy and coverage of transformer-based large language models for nucleic acid structure generation. The resulting structural information is crucial for elucidating the functions of naturally occurring nucleic acids and for guiding the design of synthetic nucleic acids.
Supervisor
SU, Haibin
Quota
3
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP3200
UROP4100
Applicant's Roles
Collecting and processing nucleic acid sequences from online databases; training and testing the transformer-based structure generator.
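As a rough illustration of the sequence-processing part of this role, the sketch below parses FASTA-formatted text and converts each sequence into fixed-length integer tokens of the kind a transformer model typically consumes. It is a minimal sketch and not the project's actual pipeline: the vocabulary `VOCAB`, the helper names `parse_fasta`, `normalize` and `encode`, the choice to map T to U, and the toy records are all assumptions made for the example.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical vocabulary (an assumption): padding, unknown, then the RNA alphabet.
VOCAB: Dict[str, int] = {"<pad>": 0, "<unk>": 1, "A": 2, "C": 3, "G": 4, "U": 5}


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def normalize(seq: str) -> str:
    """Upper-case the sequence and map DNA thymine (T) to RNA uracil (U)."""
    return seq.upper().replace("T", "U")


def encode(seq: str, max_len: int) -> List[int]:
    """Map nucleotides to token ids, truncating or right-padding to max_len."""
    ids = [VOCAB.get(base, VOCAB["<unk>"]) for base in normalize(seq)[:max_len]]
    return ids + [VOCAB["<pad>"]] * (max_len - len(ids))


# Toy input (made up for illustration, not taken from any database).
example = ">seq1 toy record\nACGU\nacgt\n>seq2\nGGN\n"
records = list(parse_fasta(example))
encoded = {header: encode(seq, max_len=10) for header, seq in records}
```

Ambiguous bases such as N fall back to the `<unk>` token here. A real pipeline might instead filter such sequences out or use a richer alphabet, depending on the source database and the model.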
Applicant's Learning Objectives
Training in the manipulation of nucleic acid sequences and in deep learning model building; gaining research experience in biomolecular structure prediction.
Complexity of the project
Moderate