A Multi-Agent Adversarial Framework for Safety Alignment in Chemical Synthesis
Project Description
This project aims to develop an intelligent safety guardrail in which heterogeneous LLM agents form a "Safety Committee" that verifies chemical synthesis plans. For instance, a Toxicologist agent (using RAG over toxicology databases) and an Instrument Manager agent (grounded in instrument manuals) would engage the Planner in an adversarial "Propose-Critique-Refine" feedback loop.
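
As a rough illustration of the intended review protocol, the loop below shows how a plan could circulate between the Planner and the committee until every reviewer approves or a round budget runs out. This is a minimal Python sketch; the agent interfaces (propose, critique, refine) and the Critique structure are hypothetical, not taken from the Zen MCP codebase.

```python
from dataclasses import dataclass

# Hypothetical sketch of the Propose-Critique-Refine loop; agent classes
# and method names are illustrative, not from an existing codebase.

@dataclass
class Critique:
    agent: str       # e.g. "Toxicologist" or "Instrument Manager"
    approved: bool
    comments: str

def run_safety_committee(planner, committee, task, max_rounds=3):
    """Iterate until every committee agent approves the plan or the
    round budget is exhausted; unresolved plans are rejected."""
    plan = planner.propose(task)
    for _ in range(max_rounds):
        critiques = [agent.critique(plan) for agent in committee]
        if all(c.approved for c in critiques):
            return plan                         # plan cleared by all reviewers
        plan = planner.refine(plan, critiques)  # revise against objections
    raise RuntimeError("Plan rejected: safety concerns unresolved")
```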
Supervisor
CHENG, Lixue
Quota
2
Course type
UROP1000
UROP1100
UROP2100
UROP3100
UROP4100
Applicant's Roles
1. Study and adapt the Zen MCP codebase to implement the multi-agent system (Planner, Safety Committee, and RAG integration).

2. Develop retrieval-augmented generation (RAG) pipelines to ground the agents in domain-specific safety knowledge (see the retrieval sketch after this list).

3. Integrate external tools for obtaining chemistry data (PubChem API, vector database, web search); conduct experiments, analyze results, and draft a workshop paper (see the PubChem sketch after this list).

4. Develop evaluation metrics, such as risk interception rate and false positive/negative rates, and run comparative experiments against baseline models (e.g., GPT-5.2, Gemini 3 Pro, Claude Opus 4.5, and DeepSeek V3); see the metrics sketch after this list.
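
The retrieval sketch referenced in item 2: a minimal embed-and-retrieve step, assuming the sentence-transformers library. The model choice and the two-document corpus are placeholders, and a real pipeline would swap in a persistent vector database.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative RAG retrieval step; model and corpus are placeholders.
model = SentenceTransformer("all-MiniLM-L6-v2")

safety_docs = [
    "Diethyl ether is extremely flammable; keep away from ignition sources.",
    "Concentrated nitric acid reacts violently with organic solvents.",
]
doc_vecs = model.encode(safety_docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k safety passages most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity (embeddings are unit-normalized)
    return [safety_docs[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve("Is it safe to mix nitric acid with acetone?"))
```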
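
The PubChem sketch referenced in item 3: a minimal lookup against PubChem's public PUG REST interface. The property list shown is a small subset of what the API exposes; safety-relevant records such as GHS classifications live in other endpoints.

```python
import requests

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def compound_properties(name: str) -> dict:
    """Fetch basic molecular properties for a compound by common name."""
    url = f"{PUG}/compound/name/{name}/property/MolecularFormula,MolecularWeight/JSON"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()["PropertyTable"]["Properties"][0]

print(compound_properties("acetone"))
# e.g. {'CID': 180, 'MolecularFormula': 'C3H6O', 'MolecularWeight': '58.08'}
```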
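
The metrics sketch referenced in item 4: the three metrics computed from a binary confusion matrix, under the assumed label convention 1 = hazardous plan and 1 = flagged/blocked verdict. Reading risk interception rate as recall on hazardous plans is one reasonable interpretation of the name, not a fixed standard.

```python
def safety_metrics(y_true: list[int], y_pred: list[int]) -> dict:
    """y_true: ground-truth hazard labels (1 = hazardous); y_pred:
    committee verdicts (1 = flagged/blocked, 0 = approved)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "risk_interception_rate": tp / (tp + fn) if tp + fn else 0.0,  # hazardous plans caught
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,     # safe plans wrongly blocked
        "false_negative_rate": fn / (tp + fn) if tp + fn else 0.0,     # hazardous plans missed
    }

print(safety_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
# {'risk_interception_rate': 0.5, 'false_positive_rate': 0.5, 'false_negative_rate': 0.5}
```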
Applicant's Learning Objectives
1. Design and implement a multi-agent adversarial safety system for high-stakes scientific environments.

2. Integrate chemistry knowledge with AI tools to enhance physical grounding and real-world applicability.

3. Gain hands-on experience with tool-augmented LLM agents (RAG, API calls, real-time data fetching).

4. Develop and evaluate a domain-specific safety benchmark, contributing to standardized evaluation in AI-for-science.
Complexity of the project
Challenging