IBM Research has released CUGA, an open-source agent harness, together with 24 single-file example applications intended to eliminate the repetitive plumbing involved in building agentic apps. The harness, installable with pip install cuga, handles core orchestration, state management, and tool binding, so developers only need to define an agent’s goals and the tools it is permitted to use. Source: IBM Research’s official CUGA Apps announcement.
CUGA targets developers building agentic workflows who would otherwise spend time wiring together model clients, tool adapters, state tracking, and guardrails before writing any task-specific logic. Each of the 24 example apps fits in a single FastAPI file, and the full agent definition requires only four constructor arguments.
Where generic agent frameworks leave the wiring of model clients, tool adapters, and state management to the developer, CUGA ships with built-in orchestration for planning before acting, execution, and self-correction. For long-horizon tasks, the harness tracks intermediate variables and runs a reflection step that catches bad tool calls and re-plans instead of pushing forward with incorrect data.
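The execute, reflect, and re-plan cycle described above can be illustrated with a toy loop. This is a minimal sketch and not CUGA's implementation: Step, RunState, run_with_reflection, and the reflect/replan callbacks are all hypothetical names invented for the illustration, and the "tool" is a hard-coded lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    tool: str       # name of the tool to call
    args: dict      # keyword arguments for that tool
    save_as: str    # variable name under which the result is tracked


@dataclass
class RunState:
    variables: dict = field(default_factory=dict)  # intermediate results
    log: list = field(default_factory=list)        # ordered record of outcomes


def run_with_reflection(plan, tools, reflect, replan, max_retries=2):
    """Run each step, check its result, and re-plan the step when the check fails."""
    state = RunState()
    for step in plan:
        for _ in range(max_retries + 1):
            result = tools[step.tool](**step.args)
            if reflect(step, result):
                state.variables[step.save_as] = result
                state.log.append(f"ok:{step.tool}")
                break
            # Reflection rejected the result: record it and try a repaired step.
            state.log.append(f"rejected:{step.tool}")
            step = replan(step, result)
        else:
            raise RuntimeError(f"could not recover step {step.save_as!r}")
    return state


def lookup_price(sku: str):
    """Hypothetical tool: returns None when the SKU is unknown."""
    return {"A-100": 19.5}.get(sku)


def reflect(step, result):
    # Hypothetical check: a missing value counts as a bad tool call.
    return result is not None


def replan(step, result):
    # Hypothetical repair: retry with the SKU in its canonical format.
    return Step(step.tool, {"sku": "A-100"}, step.save_as)


state = run_with_reflection(
    [Step("lookup_price", {"sku": "a100"}, "price")],
    {"lookup_price": lookup_price},
    reflect,
    replan,
)
print(state.variables, state.log)
```

The point of the sketch is that the bad first call never reaches the tracked variables; only a result that passes the reflection check is stored for later steps.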
A key difference from most agent frameworks is that CUGA moves planning and error recovery into the harness rather than relying on the underlying LLM to correct its own mistakes. The IBM Research team credits this design with CUGA’s first-place rankings on the AppWorld and WebArena benchmarks.
The harness executes tasks via a mix of pre-defined tool calls and generated code (CodeAct), allowing agents to perform complex operations like data transformation or API integration without pre-built tool adapters. This combination of structured tool use and dynamic code generation is what allows CUGA to handle tasks that would typically require custom workflow coding for each use case.
This harness-centric design also allows smaller open-weight models such as gpt-oss-120b to perform reliably on long-running tasks where they would typically fail under lighter orchestration layers.
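The mix of pre-defined tool calls and generated code can be pictured with a small example. This is a sketch under stated assumptions, not how CUGA runs CodeAct: GENERATED_CODE is a hard-coded string standing in for model output, fetch_orders and run_generated are hypothetical names, and emptying the builtins is only a convenience for the demo, not a security boundary (real sandboxes such as containers exist for that reason).

```python
def fetch_orders():
    """Pre-defined tool: return raw order records (hard-coded for the demo)."""
    return [
        {"region": "EU", "amount": 120.0},
        {"region": "US", "amount": 80.0},
        {"region": "EU", "amount": 30.0},
    ]


# Stand-in for code a model would generate to transform the tool's output.
GENERATED_CODE = """
totals = {}
for order in fetch_orders():
    totals[order["region"]] = totals.get(order["region"], 0.0) + order["amount"]
result = totals
"""


def run_generated(code: str, tools: dict):
    """Execute generated code with only the listed tools in scope; return `result`."""
    namespace = {"__builtins__": {}, **tools}  # NOT a real sandbox
    exec(code, namespace)
    return namespace.get("result")


totals = run_generated(GENERATED_CODE, {"fetch_orders": fetch_orders})
print(totals)
```

No dedicated "sum by region" tool had to exist: the aggregation is expressed as generated code that calls the one tool that was bound.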
Developers configure cost and latency tradeoffs via declarative settings, with three built-in reasoning modes (Fast, Balanced, Accurate) and support for code execution in local, Docker/Podman, or E2B cloud sandboxes.
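One way to picture declarative settings of this kind is a small typed configuration object. The sketch below is illustrative only: the key names ("mode", "sandbox"), the class names, and the default values are assumptions made for the example and are not taken from CUGA's actual settings format.

```python
from dataclasses import dataclass
from enum import Enum


class ReasoningMode(Enum):
    FAST = "fast"
    BALANCED = "balanced"
    ACCURATE = "accurate"


class Sandbox(Enum):
    LOCAL = "local"
    CONTAINER = "container"  # Docker or Podman
    E2B = "e2b"              # E2B cloud sandbox


@dataclass(frozen=True)
class RunSettings:
    mode: ReasoningMode = ReasoningMode.BALANCED  # assumed default
    sandbox: Sandbox = Sandbox.LOCAL              # assumed default


def load_settings(raw: dict) -> RunSettings:
    """Validate a plain dict (for example, a parsed settings file) into typed settings."""
    return RunSettings(
        mode=ReasoningMode(raw.get("mode", "balanced")),
        sandbox=Sandbox(raw.get("sandbox", "local")),
    )


settings = load_settings({"mode": "fast", "sandbox": "e2b"})
print(settings)
```

Because the values are enums, a typo such as "fastest" raises a ValueError when the settings are loaded rather than surfacing later during a run.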
To demonstrate the low implementation overhead, IBM Research published the 24 working example apps in a public gallery on Hugging Face. Each app’s repository contains the single FastAPI file and a minimal UI, and requires no external configuration.
One sample app, an IBM Cloud architecture advisor, recommends IBM Cloud services by binding a custom inline tool that queries the IBM Cloud Global Catalog alongside generic web search tools loaded via the Model Context Protocol (MCP).
The full agent definition for the app requires only four arguments passed to the CugaAgent constructor: a model provider (configurable via environment variable to support OpenAI, Anthropic, watsonx, LiteLLM, or Ollama), the tool list, custom system instructions, and a folder for state and policy storage.
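The shape of such a four-part definition can be sketched without the library itself. AgentSpec below is a stand-in dataclass, not the real CugaAgent class; apart from cuga_folder, the field names are assumptions, and search_catalog returns hard-coded sample data where the real app queries the IBM Cloud Global Catalog.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Sequence


@dataclass
class AgentSpec:
    """Stand-in for the four arguments described above (not the real class)."""
    model: str                  # model provider or model handle
    tools: Sequence[Callable]   # tools the agent may call
    instructions: str           # custom system instructions
    cuga_folder: Path           # folder for state and policy storage


def search_catalog(keyword: str) -> list[str]:
    """Return IBM Cloud catalog entries whose name contains the keyword."""
    # Hard-coded sample data; the real tool would call the catalog API.
    catalog = ["Cloud Object Storage", "Code Engine", "Databases for PostgreSQL"]
    return [name for name in catalog if keyword.lower() in name.lower()]


spec = AgentSpec(
    model="placeholder-model",
    tools=[search_catalog],
    instructions="Recommend IBM Cloud services and cite the catalog entry used.",
    cuga_folder=Path("./cuga_state"),
)
print(spec.tools[0]("storage"))
```

Everything task-specific lives in the tool function and the instruction string; the remaining two fields only say which model to use and where state is kept.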
The harness supports interchangeable tool bindings for OpenAPI specs, MCP servers, and LangChain functions, all configured via the same interface, so teams can reuse existing tooling without rewriting adapters. Across all examples, generic stateless capabilities like web search are pulled from shared MCP servers, while app-specific logic is defined as standard Python functions whose docstrings guide the agent’s tool selection.
No custom framework-specific syntax is required, and any developer familiar with writing FastAPI routes can read and modify every line of the example code.
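To see why a docstring on a plain Python function is enough to guide tool selection, consider how a harness might turn a function into a tool description. The sketch below uses only the standard inspect module; describe_tool and estimate_monthly_cost are hypothetical names, the rates are made up, and the record layout is an assumption rather than CUGA's internal format.

```python
import inspect


def describe_tool(fn) -> dict:
    """Build a name/description/parameters record from a plain Python function."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def estimate_monthly_cost(service: str, hours: float) -> float:
    """Estimate the monthly cost in USD of running a service for a number of hours."""
    hourly_rates = {"small-vm": 0.05, "large-vm": 0.40}  # made-up rates
    return round(hourly_rates[service] * hours, 2)


record = describe_tool(estimate_monthly_cost)
print(record)
```

The description field is exactly the docstring, so a clearer docstring directly improves what the model sees when deciding which tool to call.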
For teams moving beyond prototyping, CUGA includes built-in production governance features that require no changes to the core agent code. The harness supports declarative guardrails, multi-agent delegation via the Agent-to-Agent (A2A) protocol, and Docling-powered retrieval-augmented generation (RAG) for document-grounded responses, all configurable via policy files stored in the agent’s designated cuga_folder.
Because model provider switching is handled via a single environment variable, the same agent definition can run on a local open-weight model for testing and a governed enterprise model in production without any code changes.
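The environment-variable pattern is easy to sketch in isolation. In the example below, the variable name AGENT_MODEL_PROVIDER, the resolve_provider helper, and the "ollama" default are all hypothetical; only the list of providers comes from the description above, and the source does not name the variable CUGA actually reads.

```python
import os

# Providers listed in the description above, normalised to lowercase.
PROVIDERS = {"openai", "anthropic", "watsonx", "litellm", "ollama"}


def resolve_provider(env=None, default="ollama") -> str:
    """Pick the model provider from the environment, falling back to a default."""
    env = os.environ if env is None else env
    # AGENT_MODEL_PROVIDER is a hypothetical variable name for this sketch.
    name = env.get("AGENT_MODEL_PROVIDER", default).strip().lower()
    if name not in PROVIDERS:
        raise ValueError(f"unsupported provider: {name!r}")
    return name


# The same code path serves both environments; only the variable differs.
print(resolve_provider({}))                                    # local testing
print(resolve_provider({"AGENT_MODEL_PROVIDER": "WatsonX"}))   # production
```

Passing the environment as a parameter keeps the function testable without mutating the process environment.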
The cuga-apps gallery is available for public browsing and forking on Hugging Face, and IBM Research encourages developers to adapt the examples for their own use cases. The harness itself is open-source and free to use, and it does not lock users into a particular model provider or deployment infrastructure.
Bottom line: For developers looking to reduce time spent on agentic app orchestration plumbing, CUGA’s 24 single-file FastAPI examples provide a copy-and-adapt starting point that supports configurable reasoning modes, interchangeable tool bindings, and portability between local and enterprise model providers without code rewrites.
