AWS brought the heat to an already steamy NYC last month, hosting the AWS Summit, where the vendor rolled out agent-based tools aimed squarely at the challenge of taking generative AI mainstream in the enterprise.

Omdia view

Summary

AWS brought the heat to an already steamy NYC last month by hosting the AWS Summit, where the vendor rolled out several solutions specific to the challenge of taking generative AI mainstream in the enterprise. In this report, Omdia dives into a key announcement from the event that will benefit generative AI in the enterprise.

Why this matters

With the constant conveyor belt of generative AI announcements clamoring for market attention, it is easy to lose track of where a given technology provider stands relative to its peers. Under “normal” circumstances, first movers grasp and often retain the advantage. And yet, with generative AI and large language models (LLMs), such distinctions mean nothing. The LLM ecosystem of technology providers and researchers is locked in discovery mode, where each day existing products and practices are discarded in favor of new technologies, novel approaches, and new capabilities.

An exclusive partnership that looks like a competition-killer can turn into an albatross overnight, holding both partners back. The same goes for the models themselves, as new models appear daily featuring new pre-training datasets, model architectures, and fine-tuning innovations. With this chaos as a tailwind, fortune favors first movers capable of changing course on a dime to accommodate and capitalize on new opportunities.

To accomplish this feat, a company must architect with radical change in mind, creating a solid technological underpinning capable of absorbing unforeseen shifts like LLMs. For the AWS of two years ago, this kind of call to arms would have seemed completely foreign: the company espoused a considered, consistent pace, rolling out new products each December and adding best-of-breed solutions on an as-needed basis.

That company is no more. The AWS that took center stage at the AWS Summit in NY at the end of July looked anything but slow and nothing like a vendor playing catch-up with first movers Microsoft and OpenAI. Consider, for example, the broad swath of announcements it made there, each focused on applying generative AI across the portfolio. The company announced new EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs that promise approximately 80% better performance than earlier generations. It also announced points of integration between AWS Glue Studio and Amazon CodeWhisperer, which will streamline data integration workflows in support of LLM development while opening new vistas of self-service analytics for AWS customers. New generative business intelligence (BI) capabilities in Amazon QuickSight will help customers democratize access to analytical insights across the business.

From Omdia's point of view, the most impactful announcement of the event centered on the addition of agents for Amazon Bedrock, the fully managed service for creating and deploying LLM-powered applications, enabling autonomous agents capable of connecting with the information and functionality necessary to carry out complex interactions. When AWS announced Amazon Bedrock in April 2023, the company emphasized choice without risk by helping companies select, build, and deploy foundation models (FMs) not just from Amazon (the Titan family of models) but also from a collection of generative AI-specific partners including AI21 Labs, Anthropic, and Stability AI (see Searching for LLMOps in a new generative AI platform from AWS).

Introducing LLM agents

Fast forward to the AWS Summit in NY in late July 2023. AWS has taken a huge step forward with Amazon Bedrock, helping developers build generative AI applications that can break down user requests into independent, autonomous tasks. At a basic level, developers can create applications capable of understanding, decomposing, and responding to complex requests.

These agent-based apps accomplish this by breaking requests down into flexible, tool-based tasks. These tasks can involve going back to an LLM for question clarification, performing a semantic search of company information, or making an API call to internal or external backend systems. Further, these tasks can operate in parallel or in sequence, with or without interdependencies, forming what AWS terms an orchestration plan: a plan that fulfills a broader use case, such as a shopping agent that helps customers complete a product return.
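To make the concept concrete, the following is a minimal, purely illustrative sketch (not the Agents for Amazon Bedrock API) of how a request might be decomposed into an orchestration plan of interdependent, tool-based tasks; the product-return tasks shown are hypothetical.

```python
# Illustrative sketch only: a minimal model of an "orchestration plan" built
# from tool-based tasks, not the Agents for Amazon Bedrock API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    name: str
    run: Callable[[dict], dict]            # a tool call, API request, or LLM prompt
    depends_on: list[str] = field(default_factory=list)

def execute_plan(tasks: list[Task]) -> dict:
    """Run tasks in dependency order; independent tasks could run in parallel."""
    results: dict[str, dict] = {}
    pending = list(tasks)
    while pending:
        for task in list(pending):
            if all(dep in results for dep in task.depends_on):
                inputs = {dep: results[dep] for dep in task.depends_on}
                results[task.name] = task.run(inputs)
                pending.remove(task)
    return results

# Hypothetical "product return" agent: look up the order and check the return
# policy (independent tasks), then create the return label once both complete.
plan = [
    Task("lookup_order", lambda _: {"order_id": "123", "item": "shoes"}),
    Task("check_policy", lambda _: {"returnable": True}),
    Task("create_label",
         lambda deps: {"label": f"RMA for {deps['lookup_order']['item']}"},
         depends_on=["lookup_order", "check_policy"]),
]
print(execute_plan(plan))
```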

Market impact

AWS is not the first company to facilitate the creation of agent-based, autonomous LLM solutions. Driving a great deal of innovation within the generative AI market, LLM orchestration tools like the open-source framework LangChain have helped developers build LLM workflows that stitch together many autonomous tools (web scrapers, Python interpreters, etc.) to accomplish complex, multi-step tasks. For example, developers can use an LLM to define a series of steps necessary to answer a complex question, passing those steps to an agent orchestrator, which can then bring in programmatic tools like a Python interpreter to carry out one or more of those steps.
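As a rough illustration of this pattern, the sketch below uses the LangChain agent API as it stood in mid-2023; the word-count tool and the prompt are hypothetical stand-ins for real enterprise tools, and the exact API surface may have changed since.

```python
# Minimal sketch of an LLM agent orchestrating tools with LangChain
# (API as of mid-2023); the tool and prompt below are illustrative.
from langchain.agents import initialize_agent, AgentType, Tool
from langchain.llms import OpenAI

def word_count(text: str) -> str:
    """A trivial 'tool' the agent can call as one step of its plan."""
    return str(len(text.split()))

llm = OpenAI(temperature=0)  # assumes OPENAI_API_KEY is set in the environment

tools = [
    Tool(
        name="word_counter",
        func=word_count,
        description="Counts the number of words in a piece of text.",
    ),
]

# The ReAct-style agent asks the LLM to plan steps, pick tools, and feed
# intermediate results back in until it can answer the original request.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

agent.run("How many words are in the phrase 'generative AI in the enterprise'?")
```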

Major LLM platform developers themselves are beginning to build such capabilities into their own APIs, as with OpenAI’s recently introduced Functions. With OpenAI Functions, developers can readily integrate external tools (and APIs) into their solutions, passing structured information into and out of LLMs to power these integrations. This capability will figure heavily into OpenAI's prospects, as the vendor intends to use these functions to facilitate its still nascent but rapidly growing marketplace for ChatGPT Plugins. At the time this report was written, there were over 800 third-party plugins in OpenAI's Plugin Store. ChatGPT users can enable plugins they deem both safe and useful. When these users then ask ChatGPT a question, the underlying model platform will figure out which tools are needed to fulfill the user request, starting up and connecting with those services at execution time.
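A minimal sketch of that flow, using the OpenAI Python SDK's function-calling interface as introduced in mid-2023, might look like the following; the order-status function and its schema are hypothetical examples.

```python
# Sketch of OpenAI function calling (openai Python SDK, mid-2023 API).
# The get_order_status function and its schema are hypothetical examples.
import json
import openai

def get_order_status(order_id: str) -> dict:
    # Stand-in for a real call to an internal or external backend system.
    return {"order_id": order_id, "status": "shipped"}

functions = [{
    "name": "get_order_status",
    "description": "Look up the shipping status of a customer order.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

messages = [{"role": "user", "content": "Where is my order 123?"}]

# 1. The model decides whether (and how) to call the declared function.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613", messages=messages, functions=functions
)
msg = response["choices"][0]["message"]

if msg.get("function_call"):
    # 2. The application executes the call and returns the structured result.
    args = json.loads(msg["function_call"]["arguments"])
    result = get_order_status(**args)
    messages += [msg, {"role": "function", "name": "get_order_status",
                       "content": json.dumps(result)}]
    # 3. The model composes a natural-language answer from the tool output.
    final = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613", messages=messages)
    print(final["choices"][0]["message"]["content"])
```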

Balancing the scales

This kind of autonomous functionality is extremely powerful, allowing users, for example, to easily bring contextual information into ChatGPT and work with their own data sets (e.g., summarize and ask questions about PDF files). Likewise, developers can build chatbots capable of delivering personalized, contextual, and, most importantly, up-to-date answers drawn from dynamic data sources in addition to the LLM’s static training data. However, from a developer’s perspective, things are not so simple.

Enterprises seeking to replicate this kind of functionality internally, securely using their own information and tooling, will not take kindly to trusting external, third-party services, as with OpenAI's ChatGPT plugin approach. This leaves them having to trade simplicity for control. Certainly, they can use libraries like LangChain to build their own autonomous agent orchestration architecture. Still, they will have to go through the trouble of wrangling the storage and compute resources required to stand up and power those runtime agent tools.

AWS' new Bedrock agent capabilities seek to retain both simplicity and control. Using Bedrock's visual environment, developers can assemble a secure, governable, and robust agent that can deliver timely answers using proprietary data, all without dealing with the complexities of provisioning and managing underlying resources.

For example, in building an information retrieval agent, developers will not have to connect directly to supporting information sources like OpenSearch, Kendra, or S3 buckets. In the same way, they will not have to deal directly with provisioning permissions or execution resources for agent functionality. AWS handles those as event-driven serverless resources via AWS Lambda, automatically breaking the request down into the supporting tasks that make up the orchestration plan.
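As a rough sketch of what one such serverless task might look like, the Lambda handler below fulfills a single, hypothetical order-lookup step; the event and response shapes are simplified for illustration, as the actual contract is defined by the Agents for Amazon Bedrock service.

```python
# Illustrative AWS Lambda handler standing in for one agent task (an order
# lookup). The real Agents for Amazon Bedrock service defines its own event
# and response contract; the field names used here are simplified examples.
import json

def lambda_handler(event, context):
    # The agent's orchestration plan invokes this function with parameters it
    # extracted from the user's request (parameter shape is illustrative).
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    order_id = params.get("order_id", "unknown")

    # Stand-in for a call to an internal order-management system.
    result = {"order_id": order_id, "eligible_for_return": True}

    # Hand a structured result back to the agent so it can fold the answer
    # into the next step of its plan.
    return {"statusCode": 200, "body": json.dumps(result)}
```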

Future outlook

For companies looking for simplicity that are willing to share their data with third-party tool (plugin) providers and work with just one LLM, autonomous LLM capabilities like OpenAI's ChatGPT plugins and Functions can deliver a great deal of value. For companies that demand more choice and greater control, tools like LangChain can work wonders, but with choice and control comes a great deal of complexity that can easily turn into technical debt over time.

With the Amazon Bedrock agent functionality, AWS offers a better way to balance choice, control, and complexity. Whether using AWS’ internal Titan family of LLMs or opting to work with one of AWS’ LLM partners (Cohere, AI21 Labs, et al.), customers can use the same experience not only to create autonomous agents but also to carry out other LLM operations (LLMOps) tasks such as model fine-tuning, output monitoring, etc.

On that point, there is still much more work ahead for AWS and the rest of the industry. For Amazon Bedrock, what remains is further refinement around fine-tuning models to align more closely with company data and corporate responsible AI requirements, using advanced techniques such as parameter-efficient fine-tuning (PEFT) and reinforcement learning from human feedback (RLHF).
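For readers unfamiliar with PEFT, the sketch below shows the general idea using the open-source Hugging Face peft library (LoRA); the base model and hyperparameters are illustrative and are not tied to Bedrock's own fine-tuning APIs.

```python
# Minimal PEFT (LoRA) sketch using the Hugging Face peft library; the base
# model and hyperparameters are illustrative, not a Bedrock-specific API.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = "gpt2"  # stand-in for any causal LM you are licensed to fine-tune
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA injects small trainable adapter matrices instead of updating all model
# weights, so only a fraction of parameters need to be trained and stored.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,               # adapter rank
    lora_alpha=16,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```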

Also, building on its success in creating public stores for software and data (AWS Marketplace and AWS Data Exchange), the company should extend its new Amazon Bedrock agent tooling to allow companies to productize agents for reuse and extension, at first internally but eventually externally in support of select use-case and vertical ecosystems. AWS hopes to encourage the development of custom, user-defined agents capable of working with information and functionality specific to a given customer.

And most importantly, AWS will need to invest in tools capable of controlling model output. Pre-training may give LLMs a basic skillset; fine-tuning may help refine those skills within a specific domain or task; and RLHF may improve alignment between LLMs and company policies surrounding responsible AI. But without some means of processing model outputs before they are sent back to the consumer, another LLM, or another agent, practitioners can only hope for coherent and consistent results.

AWS and other providers will need to create supporting tools specific to processing model output, for instance, passing those outputs through a rules engine to ensure no personally identifiable information (PII) or sensitive intellectual property (IP) slips through. Alternatively, this post-processing layer might include another LLM that could restate the original question and feed it back into the original model to address such privacy issues or simply correct formatting inconsistencies.
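One minimal, illustrative version of such a rules pass might look like the following; the regular expressions and redaction strategy are simplified examples rather than a production-grade filter.

```python
# Illustrative post-processing "rules engine" for model output: redact
# PII-like patterns before returning text to the user or to another LLM.
# The regexes below are simplified examples, not a production-grade filter.
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(model_output: str) -> tuple[str, list[str]]:
    """Return the cleaned output plus a list of rule names that fired."""
    violations = []
    cleaned = model_output
    for name, pattern in PII_PATTERNS.items():
        if pattern.search(cleaned):
            violations.append(name)
            cleaned = pattern.sub(f"[REDACTED {name.upper()}]", cleaned)
    return cleaned, violations

text = "Contact Jane at jane.doe@example.com or 555-867-5309 about the refund."
cleaned, violations = redact(text)
print(cleaned)      # PII placeholders instead of raw values
print(violations)   # ["email", "phone"]; could trigger a corrective re-prompt
```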

If AWS' recent efforts in extending Amazon Bedrock with support for autonomous agents are anything to go by, the vendor likely already has such a tool under construction, one that can ensure model outputs are coherent, consistent, accurate, appropriate, and topical. Such tooling will serve as a crucial point of differentiation among AI platform providers (see Technology Analysis: Responsible LLM Tools and Practices). More importantly, in creating this kind of LLMOps tooling, AWS and its peers can help LLMs find a welcome home in the enterprise that goes beyond experimentation to drive real change.

Appendix

Further reading

Searching for LLMOps in a new generative AI platform from AWS (April 2023)

Technology Analysis: Responsible LLM Tools and Practices (August 2023)

Author

Bradley Shimmin, Chief analyst, AI platforms, analytics and data management

askananalyst@omdia.com