Velocity by Booz Allen

How Will Agents Be Managed? With a Familiar, Yet New, Platform

AI agent capabilities extend to advanced reasoning, planning, tool usage, maintaining memory, and self-reflection. These attributes equip AI applications with a fundamentally new set of capabilities, reshaping the way we construct and interact with AI systems. Establishing the AI Agents Management Platform within the solution architecture empowers software developers to become managers of the agents they control: selecting the best LLM for the task at hand, managing token usage and budget among deployable agents, choosing personas and fine-tuning prompts, and managing the data pipelines that provide the agents with contextual memory to better solve the problem they're presented with.

The AI Agents Management Platform (see Figure 2) is becoming an integral part of the LLM application architecture. As these frameworks mature and their reliability improves, there will be a significant shift in the AI landscape that allows the developer to be a manager instead of a worker. The AI Agents Management Platform serves as a control center, providing a unified interface for businesses to monitor and manage their AI agents. It encompasses several key areas:

• Agent Specialization. AI agents can be specialized to handle specific tasks. The management platform allows businesses to assign and manage these specializations, ensuring that each agent is used in areas where it can deliver the most value.

• Regression and Drift Management. AI agents, like any machine learning models, can suffer from model drift over time. The management platform provides tools to monitor for drift and implement corrective measures when necessary. It also allows for regression testing to ensure that updates or changes to the AI agents do not negatively impact their performance.

• Resource Management. AI agents consume resources, such as tokens, during their operations. The management platform tracks this consumption, enabling businesses to manage token usage and budget across deployed agents (see the sketch following this list).
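To make the resource-management idea concrete, the sketch below shows one way a platform might enforce per-agent token budgets. All class and method names (AgentBudget, ManagementPlatform, and their methods) are illustrative assumptions, not a reference to any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBudget:
    """Illustrative per-agent resource ledger for a management platform."""
    agent_id: str
    token_budget: int          # maximum tokens this agent may consume
    tokens_used: int = 0

    def can_spend(self, tokens: int) -> bool:
        return self.tokens_used + tokens <= self.token_budget

    def record(self, tokens: int) -> None:
        if not self.can_spend(tokens):
            raise RuntimeError(f"{self.agent_id}: token budget exhausted")
        self.tokens_used += tokens


@dataclass
class ManagementPlatform:
    """Tracks resource usage across deployed agents (hypothetical interface)."""
    budgets: dict = field(default_factory=dict)

    def register(self, agent_id: str, token_budget: int) -> None:
        self.budgets[agent_id] = AgentBudget(agent_id, token_budget)

    def charge(self, agent_id: str, tokens: int) -> None:
        self.budgets[agent_id].record(tokens)
```

A real platform would persist these ledgers and surface them through the management dashboard alongside drift and regression metrics.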

Figure 2: Approach to managing and developing with AI agents. This framework can be used when architecting an AI agent application and provides the foundation for generative AI implementation and a holistic approach across people, processes, and technology. The approach outlines how the Management Platform connects with and controls the AI Agents, the Memory Layer, and the Thought Engine that powers the application.

[Figure 2 diagram components: Management Platform; LLM APIs and Cloud APIs (proprietary, open, cloud); Tool APIs and Plugins (agent and tool configuration); LLM Ops (logging and metric tracking); User Input (Business); Data Pipelines]

ACTIONS AND STEPS, EXPLAINED

1. User Input (Developer): The user makes a request through the AI agent application to generate an output (e.g., code). Through the management dashboard, the business user manages all agent functionality, controlling the permissions within the application, the LLMs being used, the tools that are integrated, and the data that can be accessed. A business leader's actions happen in parallel to the work being done by the AI agents.

2. Prompt Management: Improve contextual understanding by analyzing the conversation, the user's chat history, role, and stored data.

3. API Call to LLM: Send the user's input and contextual information to the LLM.

4. LLM Orchestration: The agent communicates with itself, other agents, applications that it has access to, and long-term memory, which adds contextual information that the agent uses to plan and implement a solution.

5. Refinement: If needed, refine the response by sending it back to the LLM with specific instructions for solution improvement.

6. Data Storage: Save the final response and LLM output in the appropriate data resources.

7. Agent Response: The AI agent sends the output to the user, which can be in the form of text, code, or an action made within an integrated tool. The AI agent also outputs information to the management platform by logging relevant data and metrics.
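The seven steps above might map to a request-handling loop like the following minimal Python sketch. The llm, memory, and log objects, their methods, and the refinement check are all assumed interfaces for illustration, not part of the framework described here.

```python
SYSTEM_PROMPT = "You are a coding assistant."   # illustrative persona prompt


def needs_refinement(draft: str) -> bool:
    # Placeholder quality check; a real platform would use richer heuristics.
    return "TODO" in draft


def handle_request(user_query: str, llm, memory, log) -> str:
    """One pass through the seven steps; llm, memory, and log are assumed interfaces."""
    # Steps 1-2: user input and prompt management -- enrich the query with
    # chat history, the user's role, and stored context.
    context = memory.retrieve(user_query)
    prompt = f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nUser:\n{user_query}"

    # Steps 3-4: API call and LLM orchestration.
    draft = llm.complete(prompt)

    # Step 5: refinement -- resubmit with specific improvement instructions.
    if needs_refinement(draft):
        draft = llm.complete(f"Improve this solution:\n{draft}")

    # Step 6: data storage -- persist the exchange for future context.
    memory.store(user_query, draft)

    # Step 7: agent response -- return the output and log metrics to the platform.
    log.record(prompt=prompt, response=draft)
    return draft
```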

Management Platform

At the core of AI agent operations lies the management layer, empowering both business and technical leaders with comprehensive control. This layer equips users with a robust set of tools to effectively oversee AI agent activities and user interactions. Key features encompass data management, integrations management, and model administration. In addition, it offers user permission controls, resource cost analysis, and API key management to ensure optimal utilization.
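As a rough illustration of the permission controls and API key management described above, a management layer could hold per-agent policy records like these. The AgentPolicy and ManagementLayer names and fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentPolicy:
    """Illustrative per-agent policy record a management layer might hold."""
    allowed_models: list          # LLM endpoints the agent may call
    allowed_tools: list           # integrations the agent may invoke
    data_scopes: list             # data stores the agent may read
    monthly_cost_limit_usd: float


@dataclass
class ManagementLayer:
    policies: dict = field(default_factory=dict)
    api_keys: dict = field(default_factory=dict)   # provider -> key reference

    def authorize(self, agent_id: str, model: str,
                  tool: Optional[str] = None) -> bool:
        # Deny by default: an unregistered agent has no permissions.
        policy = self.policies.get(agent_id)
        if policy is None:
            return False
        if model not in policy.allowed_models:
            return False
        return tool is None or tool in policy.allowed_tools
```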

AI Agents

The AI agents layer serves as the bridge between users' requests and meaningful outputs. Users input their requirements, such as developing Python scripts, resolving bugs, or translating code, or even submit items like GitHub pull requests. These inputs are dynamically transformed into prompts that direct the AI agents' activities. The agents collaborate not only among themselves but also with integrated tools, generating valuable outputs for users. This layer also records comprehensive data, facilitating subsequent evaluation through the intuitive management dashboard.
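One plausible shape for the dynamic prompt transformation this layer performs is sketched below; the function name, persona text, and history-truncation rule are assumptions made for illustration.

```python
def build_dynamic_prompt(task: str, persona: str, history: list) -> str:
    """Illustrative transformation of a raw user requirement into an agent prompt."""
    recent = "\n".join(history[-5:])   # keep only recent turns to bound token usage
    return (
        f"{persona}\n\n"
        f"Recent conversation:\n{recent}\n\n"
        f"Task:\n{task}\n\n"
        "Plan the steps, use your available tools, and return the final output."
    )


# Example: turning a bug-fix request into a directed agent prompt.
prompt = build_dynamic_prompt(
    task="Resolve the failing unit test in utils/parse.py",
    persona="You are a Python maintenance agent.",
    history=["User reported a bug in the date parser."],
)
```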


Memory

Enhancing the AI agents' proficiency is the memory layer, a repository of contextual information specific to tasks and enterprises. This layer grants AI agents access to an array of data stores ranging from structured to unstructured and vectorized formats. By augmenting the agents' knowledge with mission-critical insights, the memory layer significantly improves their overall performance. This contextual supplementation enables AI agents to deliver outputs that align precisely with the user's requirements.
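For the vectorized stores mentioned above, retrieval typically works by embedding the query and ranking stored items by similarity. The toy sketch below assumes an embed function supplied by the caller and uses plain cosine similarity in place of a real vector database.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorMemory:
    """Toy long-term memory: stores (embedding, text) pairs, retrieves by similarity."""

    def __init__(self, embed):
        self.embed = embed             # stand-in embedding function
        self.items = []                # list of (embedding, text) pairs

    def store(self, text: str) -> None:
        self.items.append((self.embed(text), text))

    def retrieve(self, query: str, k: int = 3) -> list:
        q = self.embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```

Retrieved snippets would then be folded into the agent's prompt as the contextual supplementation the paragraph above describes.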

[Figure 2 legend and memory detail: actions occur in parallel; data flow to and from LLM; user interactions; agent workflow; application management; optional agent action. The memory layer spans short-term and long-term stores, including thought chain history, local document storage, a vector database, a data lake, and an LLM cache.]

Thought Engine

The hosting of the AI agents application happens within the thought engine. This is a combination of cloud resources (multi-cloud compatible), local compute, and in-application processing.
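One way to picture how the thought engine might choose among these hosting options is a simple routing rule such as the following; the thresholds and backend names are invented for illustration.

```python
def choose_backend(task_size_tokens: int, sensitive: bool) -> str:
    """Illustrative routing between the hosting options named above."""
    if sensitive:
        return "local-compute"        # keep sensitive data off external APIs
    if task_size_tokens > 8_000:
        return "cloud"                # large contexts go to hosted LLM endpoints
    return "in-application"           # small tasks run in-process
```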
