Announcement_5
Excited to share our paper on benchmarking AI agents for autonomous, repo-level performance engineering! Leading agents struggle to match human expert performance, and they often introduce bugs or take performance shortcuts!