source: "raw/articles/evoskill-automated-skill-discovery-for-multi-agent-systems.md"

Summary: EvoSkill - Automated Skill Discovery for Multi-Agent Systems

TL;DR: EvoSkill automatically discovers and refines reusable agent skills through iterative failure analysis, improving accuracy on OfficeQA (+7.3 percentage points) and SealQA (+12.1 percentage points) while producing skills that transfer zero-shot to new tasks.

Key Points

Three-agent framework: Executor (runs tasks), Proposer (analyzes failures), and Skill-Builder (materializes skills)
Maintains a Pareto frontier of the top-k agent programs, retaining only skills that improve validation performance
Uses textual feedback descent to evolve skills rather than optimizing low-level artifacts like prompts or code
Skills are structured as folders containing SKILL.md instructions, metadata, and optional helper scripts
On OfficeQA (grounded reasoning over Treasury data), accuracy rose from 60.6% to 67.9%
On SealQA (search-augmented QA with noisy retrieval), accuracy rose from 26.6% to 38.7%
Skills evolved on SealQA transferred zero-shot to BrowseComp, improving accuracy by 5.3%
Uses round-robin parent selection from the frontier and stratified data partitioning for training/validation/test splits
Git-based version control tracks program lineage, with each agent configuration stored as a branch
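
The loop described in the key points above (pick a parent from the frontier in round-robin order, have the Proposer analyze failures, have the Skill-Builder materialize a skill, then keep the child only if validation improves) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `Program`, `Frontier`, `evolve`, and the `propose`/`build`/`evaluate` callbacks are hypothetical names, and the admission rule (the child must beat its parent's validation score) is an assumption about what "improves validation performance" means.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Program:
    """Hypothetical stand-in for an agent program: a name plus its skills."""
    name: str
    skills: List[str] = field(default_factory=list)
    score: float = 0.0  # validation accuracy in [0, 1]


class Frontier:
    """Bounded set of the top-k programs, ordered by validation score."""

    def __init__(self, k: int):
        self.k = k
        self.programs: List[Program] = []
        self._turn = 0  # round-robin cursor

    def next_parent(self) -> Program:
        # Round-robin: cycle through frontier members in order.
        parent = self.programs[self._turn % len(self.programs)]
        self._turn += 1
        return parent

    def consider(self, child: Program, parent: Program) -> bool:
        # Assumed rule: reject children that do not beat their parent.
        if child.score <= parent.score:
            return False
        self.programs.append(child)
        self.programs.sort(key=lambda p: p.score, reverse=True)
        del self.programs[self.k:]  # keep only the top k
        return True


def evolve(
    seed: Program,
    propose: Callable[[Program], str],   # Proposer: textual failure analysis
    build: Callable[[str], str],         # Skill-Builder: feedback -> skill
    evaluate: Callable[[Program], float],  # Executor run on validation data
    k: int = 3,
    iterations: int = 6,
) -> Frontier:
    seed.score = evaluate(seed)
    frontier = Frontier(k)
    frontier.programs.append(seed)
    for step in range(iterations):
        parent = frontier.next_parent()
        feedback = propose(parent)
        skill = build(feedback)
        child = Program(f"{parent.name}.{step}", parent.skills + [skill])
        child.score = evaluate(child)
        frontier.consider(child, parent)
    return frontier
```

In a real system the three callbacks would wrap LLM-backed agents, and each accepted child would presumably correspond to a new Git branch; the sketch keeps everything in memory so the selection logic is easy to follow.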

Concepts Covered

Agent Skills — reusable, domain-specific workflows and code that augment coding agents
Textual Feedback Descent — optimization framework using natural language feedback rather than scalar rewards
Pareto Frontier — maintains bounded set of top-performing programs for selection
Zero-Shot Transfer — skills generalize to unseen tasks without modification
Multi-Agent Systems — collaborative framework with specialized proposer and builder agents
Evolutionary Optimization — iterative mutation and selection of agent capabilities
Grounded Reasoning — OfficeQA benchmark requiring navigation of complex Treasury documents
Search-Augmented QA — SealQA benchmark testing retrieval under adversarial conditions
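
To make the Agent Skills entry above more concrete: the summary describes a skill as a folder holding SKILL.md instructions, metadata, and optional helper scripts. The sketch below writes such a folder to disk. The function name `write_skill`, the `scripts/` subfolder, and the choice to store metadata as a `name`/`description` front-matter header inside SKILL.md are illustrative assumptions; the summary does not specify the exact layout.

```python
from pathlib import Path
from typing import Dict, Optional


def write_skill(
    root: Path,
    name: str,
    description: str,
    instructions: str,
    scripts: Optional[Dict[str, str]] = None,
) -> Path:
    """Create a skill folder under `root` and return its path.

    Layout (assumed for illustration):
        <root>/<name>/SKILL.md          metadata header + instructions
        <root>/<name>/scripts/<file>    optional helper scripts
    """
    skill_dir = root / name
    skill_dir.mkdir(parents=True, exist_ok=True)

    # Metadata is written as a simple front-matter header (an assumption).
    header = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    (skill_dir / "SKILL.md").write_text(
        header + instructions + "\n", encoding="utf-8"
    )

    # Helper scripts are optional; the subfolder is created only if needed.
    if scripts:
        scripts_dir = skill_dir / "scripts"
        scripts_dir.mkdir(exist_ok=True)
        for filename, source in scripts.items():
            (scripts_dir / filename).write_text(source, encoding="utf-8")

    return skill_dir
```

Keeping each skill as a self-contained folder means a skill can be added to or removed from an agent program as a single unit, which fits the branch-per-configuration lineage tracking mentioned in the key points.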

Images/Figures

Overview of the EvoSkill process
Detailed loop diagram showing the three agents and the iteration process
Performance results on OfficeQA across training splits and tolerance levels
