Announcement_5 | Jeffrey J. Ma

Excited to share our paper on benchmarking AI agents for autonomous, repo-level performance engineering! Leading agents struggle to match human expert performance, and they often introduce bugs or take performance shortcuts.