New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators
AlphaEvolve verifies, runs and scores the proposed programs using automated evaluation metrics. These metrics provide an objective, quantifiable assessment of each solution’s accuracy and quality.
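AlphaEvolve's evaluation harness is not public, but the quoted description ("verifies, runs and scores") maps onto a simple pattern: run the candidate program on test inputs, check it against a trusted reference, and compute quantitative quality metrics. Below is a minimal, hypothetical sketch in Python using a matrix-multiplication routine as the candidate; the task, function names, tolerance, and metric choices are illustrative assumptions, not AlphaEvolve's actual code.

```python
import time
import numpy as np

def evaluate(candidate_matmul, trials: int = 20) -> dict:
    """Verify a candidate matmul routine against a reference, then score its speed."""
    rng = np.random.default_rng(0)
    max_error, runtimes = 0.0, []
    for _ in range(trials):
        a = rng.standard_normal((64, 64))
        b = rng.standard_normal((64, 64))
        start = time.perf_counter()
        out = candidate_matmul(a, b)
        runtimes.append(time.perf_counter() - start)
        max_error = max(max_error, float(np.max(np.abs(out - a @ b))))
    mean_runtime = float(np.mean(runtimes))
    return {
        "correct": max_error < 1e-6,   # hard verification gate: must match the reference
        "max_error": max_error,
        "score": -mean_runtime,        # quality metric: faster programs score higher
    }

# Example: score NumPy's own matmul as a baseline candidate.
print(evaluate(lambda a, b: a @ b))
```

Because the metrics are computed automatically, the loop can discard incorrect candidates outright and rank the rest by quality without a human in the loop.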
Yeah, that’s the way genetic algorithms have worked for decades. Have they figured out a way to turn those evaluation metrics directly into code improvements, or do they just keep doing a bunch of rounds of trial and error?
The general framework of evolutionary methods/genetic algorithms is indeed old, but it's extremely broad. What matters is how you actually mutate the candidate algorithm given the feedback. In this case, they're using the same framework as genetic algorithms (iteratively building up solutions by repeatedly modifying an existing attempt after receiving feedback), but they use an LLM for two things (a rough sketch of the resulting loop is below):
Overall better sampling (the LLM has better heuristics for figuring out what to fix compared to handwritten techniques), meaning higher efficiency at finding a working solution.
“Open set” mutations: you don’t need to pre-define what changes can be made to the solution. The LLM can generate arbitrary mutations instead. In particular, AlphaEvolve can modify entire codebases as mutations, whereas prior work only modified single functions.
The “Related Work” section (Section 5) of their whitepaper is probably what you're looking for; see here.
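To make the distinction from a classic genetic algorithm concrete, here is a minimal sketch of the loop described above: a population of candidate programs, an LLM proposing open-ended mutations, and an automated evaluator gating and scoring them. The callables `evaluate` and `propose_mutation` are hypothetical stand-ins (the latter would be a call to a hosted LLM); this is not AlphaEvolve's actual code, which works over whole codebases with diff-style edits and a richer program database.

```python
import random
from typing import Callable

def evolve(
    seed_program: str,
    evaluate: Callable[[str], dict],               # runs/verifies a program, returns metrics
    propose_mutation: Callable[[str, dict], str],  # LLM call: (program, feedback) -> new program
    generations: int = 1000,
    population_size: int = 50,
) -> str:
    """Evolve seed_program by sampling parents, asking the LLM for mutations, keeping the best."""
    population = [(seed_program, evaluate(seed_program))]
    for _ in range(generations):
        parent, feedback = random.choice(population)   # pick an existing attempt
        child = propose_mutation(parent, feedback)     # "open set" mutation proposed by the LLM
        result = evaluate(child)                       # automated verification + scoring
        if result.get("correct"):                      # discard candidates that fail verification
            population.append((child, result))
        # truncation selection: keep only the highest-scoring programs
        population.sort(key=lambda item: item[1]["score"], reverse=True)
        population = population[:population_size]
    return population[0][0]                            # best program found so far
```

The "trial and error" is still there, but the LLM makes each trial far more targeted than a random or hand-coded mutation operator, and the evaluator's feedback can be fed straight back into the next prompt.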
We gotta use up as much fresh water for cooling as we can. 🤷‍♂️