Retrieval Augmented Generation with Vector Database
Project Description
This project aims to improve language model generation by integrating a vector database for retrieval-augmented generation (RAG). The goal is to make generated text more accurate and relevant by supplying the model with contextual data retrieved from the vector database. The work involves developing algorithms for efficient data retrieval, integrating them with language models, and optimizing the interaction between the retrieval system and text generation. The expected outcome is a system that produces high-quality, contextually relevant text for a range of applications in natural language processing and AI.
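For orientation, the retrieve-then-generate flow described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the project's actual design: the names `embed`, `VectorStore`, and `build_prompt` are hypothetical, the bag-of-words `embed` function is a toy stand-in for a learned embedding model, and the in-memory list stands in for a real vector database. The final prompt would be passed to a language model, which is left out here.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


store = VectorStore()
store.add("Vector databases index embeddings for nearest-neighbour search.")
store.add("The cafeteria opens at eight in the morning.")
store.add("Retrieval-augmented generation adds retrieved passages to the prompt.")

question = "How does retrieval-augmented generation use retrieved passages?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)  # this string would be sent to a language model
print(prompt)
```

A real system would replace the exhaustive scan in `search` with an approximate nearest-neighbour index, which is where most of the retrieval-efficiency work mentioned above would lie.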
Supervisor
ZHOU, Xiaofang
Quota
3
Course type
UROP1100
UROP2100
UROP3100
UROP3200
UROP4100
Applicant's Roles
The applicant will design and implement data collection algorithms, ensuring that the dataset is diverse and of high quality, and will integrate the retrieval system with language generation models so that data flows smoothly between the two. Experience in coding, machine learning, and data engineering is required.
Applicant's Learning Objectives
Gain in-depth knowledge and practical experience in the design and implementation of retrieval-augmented generation systems. Develop skills in managing and optimizing vector databases, which are crucial for efficient data retrieval in AI applications.
Complexity of the project
Challenging