Agents
Agent and Tools
- "Agent" has many definitions depending on where you look and which provider you ask:
- Agents: LLMs with tool access
- Agents: Models using tools in a loop
- In Google's Agent Development Kit (ADK), an agent is a self-contained execution unit designed to operate autonomously to achieve specific goals.
- Capabilities:
- Perform tasks independently.
- Interact with users.
- Utilize external tools.
- Coordinate with other agents to complete complex workflows.
What Are Tools?
- Tools are functions that extend the capabilities of LLM-based agents beyond their inherent knowledge and reasoning.
- Purpose of Tools:
- Enable interaction with external and real-time data.
- Support querying databases or making API requests.
- Facilitate calculations or data processing.
- Allow execution of specific actions (e.g., creating files, running code).
- Handle authentication processes.
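The idea of tools as functions the model can invoke can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the tool names (`get_weather`, `calculate`) and the `{"tool": ..., "args": ...}` call format are assumptions for the example, and the model itself is left out, with only the dispatch side shown.

```python
# Hypothetical example tools; a real `get_weather` would call an API.
def get_weather(city: str) -> str:
    """Example tool: stands in for an external weather API call."""
    return f"Sunny in {city}"  # stubbed external data

def calculate(expression: str) -> float:
    """Example tool: arithmetic on a character-restricted input."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return float(eval(expression))  # safe here: input restricted to arithmetic

# Registry the agent exposes to the model.
TOOLS = {"get_weather": get_weather, "calculate": calculate}

def dispatch(tool_call: dict) -> object:
    """Execute a tool call requested by the model (assumed dict format)."""
    func = TOOLS[tool_call["tool"]]
    return func(**tool_call["args"])

print(dispatch({"tool": "calculate", "args": {"expression": "2 * (3 + 4)"}}))  # prints 14.0
```

In a real agent, the model would emit the tool-call dict, the runtime would execute it via `dispatch`, and the result would be fed back to the model as the next turn.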
When to Build Agents?
- Complex decision-making:
- Workflows involving nuanced judgment, exceptions, or context-sensitive decisions,
- for example refund approval in customer service workflows.
- Difficult-to-maintain rules:
- Systems that have become unwieldy due to extensive and intricate rulesets, making updates costly or error-prone,
- for example performing vendor security reviews.
- Heavy reliance on unstructured data:
- Scenarios that involve interpreting natural language, extracting meaning from documents, or interacting with users conversationally,
- for example processing a home insurance claim.
Examples of AI Agents
- Coding Assistants:
- Leverage LLMs to write, execute, and refine computer code iteratively.
- Examples:
- Cursor
- GitHub Copilot
- Search Assistants:
- Conduct multiple searches, gather information, and aggregate results to answer questions or generate reports.
- Examples:
- Perplexity
- ChatGPT Search
Workflows
- Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
- Reference: "Building Effective AI Agents" (Anthropic)
Sequential Workflow

- Decomposes a task into a sequence of steps, where each LLM call processes the output of the previous one.
- When to Use it:
- ideal for situations where the task can be easily and cleanly decomposed into fixed subtasks
- Examples:
- Generating marketing copy, then translating it into a different language.
- Writing an outline of a document, checking that the outline meets certain criteria, then writing the document based on the outline.
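The outline-then-write example above can be sketched as a simple chain of calls. The `llm` function here is a deterministic stub standing in for a real model API, so the control flow is runnable on its own; the prompts are illustrative assumptions.

```python
def llm(prompt: str) -> str:
    """Hypothetical model call; a real version would hit an LLM API."""
    return f"[response to: {prompt}]"

def sequential_workflow(topic: str) -> str:
    """Each step consumes the output of the previous one."""
    outline = llm(f"Write an outline about {topic}")
    check = llm(f"Check that this outline meets our criteria: {outline}")
    draft = llm(f"Write the document following: {outline} (review: {check})")
    return draft
```

The fixed order of calls is the defining trait: the path through the steps is decided by the code, not by the model.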
Routing Workflow

- Routing classifies an input and directs it to a specialized followup task or LLM. This workflow allows for separation of concerns.
- When to use this workflow:
- Routing works well for complex tasks where there are distinct categories that are better handled separately, and where classification can be handled accurately, either by an LLM or a more traditional classification model/algorithm.
- Examples:
- Directing different types of customer service queries (general questions, refund requests, technical support) into different downstream processes, prompts, and tools.
- Routing easy/common questions to smaller or local models, and hard/unusual questions to more capable cloud-hosted models, to optimize cost and speed.
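The customer-service example above reduces to a classifier plus a handler table. In this sketch the classifier is a keyword stub (a real system might use an LLM or a trained classifier, as noted above), and the three handlers are stand-ins for downstream prompts or processes.

```python
def classify(query: str) -> str:
    """Stub classifier; a real one could be an LLM or an ML model."""
    q = query.lower()
    if "refund" in q:
        return "refund"
    if "error" in q or "crash" in q:
        return "technical"
    return "general"

# Each category gets its own specialized downstream handler.
HANDLERS = {
    "refund": lambda q: f"Refund flow handles: {q}",
    "technical": lambda q: f"Tech support handles: {q}",
    "general": lambda q: f"General assistant handles: {q}",
}

def route(query: str) -> str:
    """Classify the input, then dispatch to the matching handler."""
    return HANDLERS[classify(query)](query)
```

The separation of concerns is the point: each handler can be prompted, tooled, and tested independently of the others.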
Parallelization Workflow

- LLMs can sometimes work simultaneously on a task and have their outputs aggregated programmatically.
- When to use this workflow:
- Parallelization is effective when the divided subtasks can be parallelized for speed, or when multiple perspectives or attempts are needed for higher confidence results.
- Examples:
- Automating evals of LLM performance, where each call evaluates a different aspect of the model's response to a given prompt.
- Reviewing a piece of code for vulnerabilities, where several different prompts review and flag the code if they find a problem.
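The code-review example above maps naturally onto concurrent calls whose flags are aggregated afterward. Here each "reviewer" is a stub that checks for a keyword rather than a real LLM call, and the aspect names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical review perspectives, one per parallel call.
ASPECTS = ["injection", "secrets", "eval"]

def review(aspect: str, code: str) -> dict:
    """Stub reviewer: flags the code if the aspect keyword appears."""
    return {"aspect": aspect, "flagged": aspect in code}

def parallel_review(code: str) -> list:
    """Run all reviewers concurrently and collect their verdicts."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(review, a, code) for a in ASPECTS]
        return [f.result() for f in futures]

flags = parallel_review("result = eval(user_input)")
```

With real LLM calls, the concurrency buys wall-clock speed; the programmatic aggregation step (here, just collecting the list) is what distinguishes this from an agent deciding on its own.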
Orchestrator-workers Workflow

- A central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results.
- When to use this workflow:
- This workflow is well-suited for complex tasks where you can't predict the subtasks needed,
- e.g. in coding, the number of files that need to be changed, and the nature of the change in each file, likely depend on the task.
- The difference from parallelization is that the subtasks are not known or pre-defined at the start.
- Examples:
- Search tasks that involve gathering and analyzing information from multiple sources for possible relevant information.
- Coding products that make complex changes to multiple files each time.
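The pattern above can be sketched as a planner, a pool of workers, and a synthesis step. All three roles are deterministic stubs here; in a real system each would be an LLM call, and the key property, shown in `plan`, is that the subtask list is produced at runtime rather than fixed in code.

```python
def plan(task: str) -> list:
    """Stub orchestrator: a real one would ask an LLM for subtasks."""
    return [f"edit file_{i}.py for: {task}" for i in range(1, 4)]

def worker(subtask: str) -> str:
    """Stub worker: stands in for a specialized LLM call."""
    return f"done: {subtask}"

def orchestrate(task: str) -> str:
    subtasks = plan(task)                  # decided dynamically, not pre-defined
    results = [worker(s) for s in subtasks]
    return "\n".join(results)              # synthesis step (stubbed)
```

Swapping the list comprehension over `worker` for concurrent execution would combine this pattern with parallelization.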
Evaluator-optimizer or Loop Workflow

- An LLM call generates a response while another provides evaluation and feedback in a loop.
- When to use this workflow:
- This workflow is particularly effective when we have clear evaluation criteria, and when iterative refinement provides measurable value.
- Example:
- Complex search tasks that require multiple rounds of searching and analysis to gather comprehensive information, where the evaluator decides whether further searches are warranted.
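The generate-evaluate loop can be sketched as two alternating calls with an iteration cap. Both roles are deterministic stubs for illustration (the evaluator here simply accepts on the third attempt); in a real system each would be an LLM call, and the evaluator's feedback would be folded into the next generation prompt as shown.

```python
def generate(task: str, feedback: str) -> str:
    """Stub generator: a real one would prompt an LLM with the feedback."""
    return f"draft for {task} (feedback applied: {feedback})"

def evaluate(draft: str, attempt: int) -> tuple:
    """Stub evaluator: accepts on the third attempt for demonstration."""
    ok = attempt >= 3
    return ok, "good" if ok else "needs more detail"

def refine(task: str, max_iters: int = 5) -> str:
    """Loop generation and evaluation until accepted or the cap is hit."""
    feedback = "none yet"
    for attempt in range(1, max_iters + 1):
        draft = generate(task, feedback)
        ok, feedback = evaluate(draft, attempt)
        if ok:
            return draft
    return draft  # best effort if the cap is reached
```

The `max_iters` cap matters in practice: without a stopping condition beyond the evaluator's approval, the loop can run (and spend) indefinitely.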