STREAM

STREAM: Self Tested Rewriting Evolutionary Adaptive Machine

The STREAM platform uses Self-refining Self-referential Coding Agents (SSCAs), which iteratively rewrite and empirically evaluate their own code. The agents target improved performance across a range of tasks, measured on coding benchmarks including SWE-bench and Polyglot. In doing so, STREAM instantiates the theoretical Gödel Machine[1] concept through applied evolution: rather than requiring a formal proof, it uses empirical experiments as evidence that a proposed new version improves performance.

AI Refined derives from a research and development initiative at University College London centred on applications of STREAM in high-security environments.

STREAM continuously evolves and improves its capabilities using Darwinian evolutionary paradigms of variation, selection and inheritance:

  • Variation: generates mutations (code modifications) of its own codebase.
  • Selection: evaluates the performance of these modified versions on coding benchmarks.
  • Inheritance: maintains an archive of all discovered agent variants, analogous to a biological gene pool, allowing future generations to branch from promising lineages and avoid getting stuck at local maxima.
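The archive described above can be pictured as a tree of variants in which every entry remembers its parent. The following is a minimal sketch under stated assumptions, not STREAM's actual implementation: the names `Variant`, `Archive`, `lineage` and `sample_parent`, the score range, and the `floor` parameter are all hypothetical. It illustrates one way to keep every discovered variant eligible as a future parent while still favouring promising lineages.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Variant:
    """One archived agent version (hypothetical record layout)."""
    variant_id: int
    parent_id: Optional[int]  # None for the initial agent
    score: float              # assumed benchmark score in [0, 1]


@dataclass
class Archive:
    """Keeps every discovered variant, not only the current best."""
    variants: Dict[int, Variant] = field(default_factory=dict)

    def add(self, parent_id: Optional[int], score: float) -> Variant:
        variant = Variant(len(self.variants), parent_id, score)
        self.variants[variant.variant_id] = variant
        return variant

    def lineage(self, variant_id: int) -> List[int]:
        """Return the ids from the root ancestor down to variant_id."""
        chain: List[int] = []
        current: Optional[int] = variant_id
        while current is not None:
            chain.append(current)
            current = self.variants[current].parent_id
        return chain[::-1]

    def sample_parent(self, rng: random.Random, floor: float = 0.05) -> Variant:
        """Sample a parent with probability proportional to score + floor.

        The floor (an assumed value) keeps weaker variants selectable, so a
        lineage that looks unpromising now can still be branched from later.
        """
        candidates = list(self.variants.values())
        weights = [v.score + floor for v in candidates]
        return rng.choices(candidates, weights=weights, k=1)[0]
```

Because no variant is ever discarded, a branch that scores lower today remains available as a starting point, which is the property the gene-pool analogy relies on to avoid local maxima.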

STREAM’s self-referential, self-improving coding agent iteratively tests its own coding capabilities and rewrites its own code to become a better coding agent. Each self-modification therefore requires STREAM to edit and then evaluate its own codebase.

During a self-modification phase, agents selected from the archive generate modified versions of themselves. During the evaluation phase, each modified agent is tested on a coding benchmark to estimate its coding capabilities, and is then added to the archive. By improving its capabilities through this loop, STREAM becomes better at both solving coding tasks and making future self-improvements.
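The loop described above can be sketched in a few lines. This is an illustrative toy under stated assumptions rather than STREAM's own code: `evolve`, `self_modify` and `evaluate` are hypothetical names, the "agent" is reduced to a single number, and the "benchmark" is a simple function that peaks at 3.0. A real system would replace these with an agent codebase and a benchmark such as SWE-bench or Polyglot.

```python
import random


def evolve(initial_agent, self_modify, evaluate, generations, rng):
    """Run a self-modify / evaluate / archive loop.

    self_modify(agent, rng) -> modified agent   (hypothetical callable)
    evaluate(agent) -> non-negative score       (hypothetical callable)
    Returns the full archive as a list of (agent, score) pairs.
    """
    archive = [(initial_agent, evaluate(initial_agent))]
    for _ in range(generations):
        # Select a parent, favouring higher scores; the small constant
        # keeps every archived agent selectable (an assumed policy).
        agents = [agent for agent, _ in archive]
        weights = [score + 1e-3 for _, score in archive]
        parent = rng.choices(agents, weights=weights, k=1)[0]

        # Self-modification phase: the parent produces a modified version.
        child = self_modify(parent, rng)

        # Evaluation phase: score the child, then archive it regardless
        # of whether it improved on its parent.
        archive.append((child, evaluate(child)))
    return archive


def toy_self_modify(agent, rng):
    """Toy stand-in for a code edit: perturb a single number."""
    return agent + rng.gauss(0.0, 0.5)


def toy_evaluate(agent):
    """Toy stand-in for a benchmark: score peaks at agent == 3.0."""
    return 1.0 / (1.0 + abs(agent - 3.0))


if __name__ == "__main__":
    history = evolve(0.0, toy_self_modify, toy_evaluate, 200, random.Random(42))
    best_agent, best_score = max(history, key=lambda pair: pair[1])
    print(f"archive size: {len(history)}, best score: {best_score:.3f}")
```

Note that the child is archived even when it scores worse than its parent; only the sampling weights express a preference, which matches the idea of keeping all variants available for later branching.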

 

[1] Gödel machines are conceptual self-improving programs that search for modifications to their own code and apply a modification only when they can formally prove that it will improve performance with respect to a designated utility function.