A Multi-Agent Adversarial Framework for Safety Alignment in Chemical Synthesis
Project Description
This project aims to develop an intelligent safety guardrail in which heterogeneous LLM agents form a "Safety Committee" that verifies chemical synthesis plans. For instance, a Toxicologist agent (using RAG over toxicology databases) and an Instrument Manager agent (grounded in instrument manuals) would engage the Planner in an adversarial "Propose-Critique-Refine" feedback loop.
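
As a rough illustration of the intended review protocol, the loop below shows how a plan could circulate between the Planner and the committee until every reviewer approves or a round budget runs out. This is a minimal Python sketch; the agent interfaces (propose, critique, refine) and the Critique structure are hypothetical, not taken from the Zen MCP codebase.

```python
from dataclasses import dataclass

# Hypothetical sketch of the Propose-Critique-Refine loop; agent classes
# and method names are illustrative, not from an existing codebase.

@dataclass
class Critique:
    agent: str       # e.g. "Toxicologist" or "Instrument Manager"
    approved: bool
    comments: str

def run_safety_committee(planner, committee, task, max_rounds=3):
    """Iterate until every committee agent approves the plan or the
    round budget is exhausted; unresolved plans are rejected."""
    plan = planner.propose(task)
    for _ in range(max_rounds):
        critiques = [agent.critique(plan) for agent in committee]
        if all(c.approved for c in critiques):
            return plan                         # plan cleared by all reviewers
        plan = planner.refine(plan, critiques)  # revise against objections
    raise RuntimeError("Plan rejected: safety concerns unresolved")
```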
Supervisor
CHENG, Lixue
Quota
2
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP4100
Applicant's Roles
1. Study and adapt the Zen MCP codebase to implement the multi-agent system (Planner, Safety Committee, and RAG integration).

2. Develop retrieval-augmented generation (RAG) pipelines to ground the agents in domain-specific safety knowledge (see the retrieval sketch after this list).

3. Integrate external tools for obtaining chemistry data (PubChem API, vector database, web search); conduct experiments, analyze results, and draft a workshop paper (see the PubChem sketch after this list).

4. Develop evaluation metrics, such as risk interception rate and false positive/negative rates, and run comparative experiments against baseline models (e.g., GPT-5.2, Gemini 3 Pro, Claude Opus 4.5, and DeepSeek V3); see the metrics sketch after this list.
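
The retrieval sketch referenced in item 2: a minimal embed-and-retrieve step, assuming the sentence-transformers library. The model choice and the two-document corpus are placeholders, and a real pipeline would swap in a persistent vector database.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative RAG retrieval step; model and corpus are placeholders.
model = SentenceTransformer("all-MiniLM-L6-v2")

safety_docs = [
    "Diethyl ether is extremely flammable; keep away from ignition sources.",
    "Concentrated nitric acid reacts violently with organic solvents.",
]
doc_vecs = model.encode(safety_docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k safety passages most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity (embeddings are unit-normalized)
    return [safety_docs[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve("Is it safe to mix nitric acid with acetone?"))
```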
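
The PubChem sketch referenced in item 3: a minimal lookup against PubChem's public PUG REST interface. The property list shown is a small subset of what the API exposes; safety-relevant records such as GHS classifications live in other endpoints.

```python
import requests

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def compound_properties(name: str) -> dict:
    """Fetch basic molecular properties for a compound by common name."""
    url = f"{PUG}/compound/name/{name}/property/MolecularFormula,MolecularWeight/JSON"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()["PropertyTable"]["Properties"][0]

print(compound_properties("acetone"))
# e.g. {'CID': 180, 'MolecularFormula': 'C3H6O', 'MolecularWeight': '58.08'}
```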
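
The metrics sketch referenced in item 4: the three metrics computed from a binary confusion matrix, under the assumed label convention 1 = hazardous plan and 1 = flagged/blocked verdict. Reading risk interception rate as recall on hazardous plans is one reasonable interpretation of the name, not a fixed standard.

```python
def safety_metrics(y_true: list[int], y_pred: list[int]) -> dict:
    """y_true: ground-truth hazard labels (1 = hazardous); y_pred:
    committee verdicts (1 = flagged/blocked, 0 = approved)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "risk_interception_rate": tp / (tp + fn) if tp + fn else 0.0,  # hazardous plans caught
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,     # safe plans wrongly blocked
        "false_negative_rate": fn / (tp + fn) if tp + fn else 0.0,     # hazardous plans missed
    }

print(safety_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
# {'risk_interception_rate': 0.5, 'false_positive_rate': 0.5, 'false_negative_rate': 0.5}
```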
Applicant's Learning Objectives
1. Design and implement a multi-agent adversarial safety system for high-stakes scientific environments.

2. Integrate chemistry knowledge with AI tools to enhance physical grounding and real-world applicability.

3. Gain hands-on experience with tool-augmented LLM agents (RAG, API calls, real-time data fetching).

4. Develop and evaluate a domain-specific safety benchmark, contributing to standardized evaluation in AI-for-science.
Complexity of the project
Challenging