Host-CPU-Aware Automated RISC-V CI Generation via Dynamic Performance Modeling
Project Description
This project develops an automated, workload-driven framework for generating RISC-V custom instructions (CIs) that are aware of the host CPU microarchitecture. The framework takes domain-specific workloads, extracts control-data flow graphs (CDFGs), applies e-graph optimization to enumerate equivalent program variants, and uses a Dependency-Event-Graph (DEG) performance model with dynamic cache and branch-predictor behavior to select CI candidates. Selected CIs are then lowered to the CVXIF accelerator interface and implemented in RTL on the CVA6 core. The work spans: (1) automated CI extraction from workload analysis; (2) cost model improvement by integrating cache and branch-predictor models into the DEG; (3) end-to-end implementation from workload to RTL; and (4) evaluation comparing CPU tuning vs. CI extension. The framework supports joint design-space exploration of CPU parameters and custom instructions.
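As a rough illustration of why the cost model needs dynamic cache behavior, the sketch below treats a DEG as a DAG with latency-weighted edges and estimates cycle count as the longest path. This is a toy under stated assumptions, not the project's actual model: the event names, the latencies, the two-cycle fused CI, and the helpers `longest_path` and `estimate_cycles` are all hypothetical.

```python
from collections import defaultdict


def longest_path(order, edges):
    """Longest-path finish time for every node of a DAG.

    order: nodes listed in a topological order.
    edges: (src, dst, latency) tuples; latency is in cycles.
    """
    succ = defaultdict(list)
    for src, dst, latency in edges:
        succ[src].append((dst, latency))
    finish = {node: 0 for node in order}
    for node in order:
        for dst, latency in succ[node]:
            finish[dst] = max(finish[dst], finish[node] + latency)
    return finish


def estimate_cycles(load_b_latency, fused=False):
    """Estimate cycles for a hypothetical trace: two loads, then add, then mul.

    load_b_latency models dynamic cache behavior for the second load.
    fused=True replaces add+mul with one hypothetical 2-cycle CI.
    """
    edges = [("start", "ld_a", 2), ("start", "ld_b", load_b_latency)]
    if fused:
        order = ["start", "ld_a", "ld_b", "ci"]
        edges += [("ld_a", "ci", 2), ("ld_b", "ci", 2)]
    else:
        order = ["start", "ld_a", "ld_b", "add", "mul"]
        edges += [("ld_a", "add", 1), ("ld_b", "add", 1), ("add", "mul", 3)]
    return longest_path(order, edges)[order[-1]]


if __name__ == "__main__":
    for label, latency in [("cache hit", 2), ("cache miss", 20)]:
        base = estimate_cycles(latency)
        with_ci = estimate_cycles(latency, fused=True)
        print(f"{label}: baseline={base} with_ci={with_ci} "
              f"speedup={base / with_ci:.2f}x")
```

With these assumed latencies the same CI shortens the critical path from 6 to 4 cycles when the load hits, but only from 24 to 22 cycles when it misses, which is the kind of host-dependent difference a static cost model would not capture.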
Supervisor
ZHANG, Wei
Quota
2
Course type
UROP1100
Applicant's Roles
1. CI-to-accelerator implementation: Map selected CIs to CVXIF instruction encodings; design and implement RTL for a CVXIF coprocessor using HLS or HDL; run functional tests.

2. Cost model and CI search: Learn and deploy the baseline cost model and CI search algorithms; use them to evaluate CI candidates and compare performance between baseline and CI-accelerated versions.

3. End-to-end validation: Run the full flow from workload to RTL execution; document evaluation results.
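For the encoding step in role 1, a minimal sketch of packing a CI into a standard RISC-V R-type instruction word is shown below. It assumes the CI is placed in the custom-0 major opcode (0b0001011); the opcode choice, the funct3/funct7 values, and the helper names `encode_r_type` and `decode_r_type` are illustrative assumptions, and the encoding space actually used by the project's CVXIF coprocessor may differ.

```python
CUSTOM0_OPCODE = 0b0001011  # RISC-V custom-0 major opcode (assumed choice)

# R-type layout, most significant field first: (name, bit width)
R_TYPE_FIELDS = [
    ("funct7", 7), ("rs2", 5), ("rs1", 5),
    ("funct3", 3), ("rd", 5), ("opcode", 7),
]


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM0_OPCODE):
    """Pack R-type fields into a 32-bit instruction word."""
    values = {"funct7": funct7, "rs2": rs2, "rs1": rs1,
              "funct3": funct3, "rd": rd, "opcode": opcode}
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Unpack a 32-bit instruction word into its R-type fields."""
    fields = {}
    for name, width in reversed(R_TYPE_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


if __name__ == "__main__":
    # Hypothetical CI with funct7=1: rd=x10, rs1=x11, rs2=x12
    word = encode_r_type(funct7=1, rs2=12, rs1=11, funct3=0, rd=10)
    print(f"0x{word:08x}")
    print(decode_r_type(word))
```

A round trip through `decode_r_type` is a cheap functional check that the encoding table used for software tests agrees with what the coprocessor decoder is expected to see.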
Applicant's Learning Objectives
1. Understand workload-driven CI extraction and the role of CDFGs, e-graphs, and DEG-based cost models.

2. Learn how dynamic behavior (cache, branch prediction) is incorporated into performance models for host-CPU-aware CI selection.

3. Understand the interaction between host CPU microarchitecture (pipeline, issue width, cache hierarchy) and custom instruction design, and how CPU parameters constrain accelerator interfaces.

4. Explore design trade-offs for domain-specific accelerators (e.g., performance vs. area vs. power) and how host-CPU-aware cost models support these trade-offs.
Complexity of the project
Moderate