Generation of adversarial chemical reactions
Project Description
In recent years, machine learning has been widely applied in chemistry to accelerate discovery and synthesis. A major challenge with chemical reaction data is that existing large reaction databases contain no high-quality negative reaction data, because chemists tend to report and record only successful reactions. Generating negative reaction data is therefore useful for balancing currently available datasets, and it provides opportunities to improve reaction prediction models.

Generative adversarial networks have been used successfully to generate adversarial examples that match the quality of real data, especially in image generation. They have also been applied to molecular generation to aid the development of new molecules. Their application to the generation of chemical reactions, however, has been limited.
GAO Hanyu
Course type
Applicant's Roles
In this project, you will use advanced generative machine learning models, such as generative adversarial networks (GANs), to generate adversarial reaction examples. Large databases of true reactions, such as the USPTO data, will be used for this purpose. The model will be used for new reaction discovery and for improving the reaction filtering model that identifies unrealistic reactions in large-scale reaction data exploration.
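To give a sense of the data involved, the sketch below builds naive negative candidates by pairing the reactants of one reaction with the product of another. This is a simple baseline and not a GAN; it is not the method the project prescribes. The reaction strings are toy examples written for illustration (not drawn from USPTO), and the function names `split_reaction` and `mismatched_negatives` are hypothetical.

```python
import random

# Toy reaction SMILES in "reactants>agents>products" form.
# These are illustrative assumptions, not records from the USPTO data.
TRUE_REACTIONS = [
    "CC(=O)O.OCC>>CC(=O)OCC",   # acetic acid + ethanol -> ethyl acetate
    "CC(=O)Cl.NC>>CC(=O)NC",    # acetyl chloride + methylamine -> N-methylacetamide
    "BrCC.[OH-]>>OCC",          # bromoethane + hydroxide -> ethanol
]


def split_reaction(rxn):
    """Split a reaction SMILES into (reactants, agents, products)."""
    reactants, agents, products = rxn.split(">")
    return reactants, agents, products


def mismatched_negatives(reactions, seed=0):
    """Pair each reactant set with a product taken from a different reaction."""
    rng = random.Random(seed)
    known = set(reactions)
    all_products = [split_reaction(r)[2] for r in reactions]
    negatives = []
    for rxn in reactions:
        reactants, agents, true_product = split_reaction(rxn)
        candidates = [p for p in all_products if p != true_product]
        if not candidates:
            continue
        fake = f"{reactants}>{agents}>{rng.choice(candidates)}"
        # Never emit a string that is already a recorded (positive) reaction.
        if fake not in known:
            negatives.append(fake)
    return negatives


if __name__ == "__main__":
    for negative in mismatched_negatives(TRUE_REACTIONS):
        print(negative)
```

Mismatched pairs like these are usually easy for a classifier to reject, and they are not guaranteed to be chemically impossible. That gap is one reason a learned generator that produces harder, more realistic negatives is of interest.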
Applicant's Learning Objectives
Through working on this project, you are expected to gain the following experience and expertise:
1. Programming;
2. Knowledge about organic chemistry and chemical reactions;
3. Machine learning algorithms (especially generative models).
Complexity of the project