Omdia view
Summary
The enterprise market for generative AI has begun to take flight over the past six weeks with both mainstream and surprise entrants rolling out new large language models (LLMs). Only a few vendors, however, have focused on the operationalization of those models from an IT perspective. Omdia explores one recent entry from AWS, Amazon Bedrock.
Rapid-fire innovation
Sometimes market-changing events come out of nowhere and hit you square in the jaw, as with the surprise introduction of OpenAI’s ChatGPT in November 2022. Delivering a solid right hook, powered by GPT-3.5, ChatGPT upended our expectations of what a chatbot, and generative AI (GAI) itself, can do.
Since then, the market disruptions have continued apace, first when ChatGPT featuring GPT-4 drove many researchers to ask whether we humans had unwittingly created a nascent form of artificial general intelligence (AGI). Then, in response to this ruckus, many competitors and contrarians quickly called for a temporary pause to the development of foundation models (FMs), principally LLMs like GPT, Bloom, NeMo, and LaMDA.
It appears there will be no ban, at least not in the US. However, if Italy is any sort of bellwether, the European Union (EU) is likely to proceed with some degree of caution, with restrictions for high-risk AI systems already spelled out within the EU AI Act.
The concerns spelled out in the EU AI Act are not without merit and touch on issues that are specific to moving LLMs into the enterprise, namely maintaining privacy of data used to train and fine-tune models, respecting the ownership of external data, assigning accountability when things go wrong, and, most importantly, ensuring the content generated by these models is factual, unbiased, and attributable ‒ all issues that have yet to be addressed in any kind of tangible manner.
And yet, within the span of just one month, the industry has seen the rapid-fire introduction of several LLM-specific machine learning (ML) platform capabilities from Salesforce, NVIDIA, Google, Microsoft, Databricks, and, very recently, AWS. Clearly, the technology provider marketplace believes that GAI will be as much of a hit in the enterprise as it has been within the consumer marketplace ‒ so long as enterprise worries over bias, accuracy, transparency, privacy, security, performance, and a raft of other responsible AI requirements are addressed through the tooling used to build and deploy these new offerings.
(For more on Google and Salesforce in particular, see Omdia’s analyst opinion, “And just like that, generative AI reaches the enterprise marketplace”).
LLM development done right
It may seem that this new cadre of LLM-capable ML development platforms has appeared out of nowhere, but these platforms have been in the works for some time, building on years of well-established operational practices. Most major ML platforms have already delivered support for FMs, especially for models belonging to the platform vendor itself, as with Google’s Universal Speech Model, which runs on Google Vertex AI.
Even with well-established platforms like Microsoft Azure ML and Google Vertex AI, customers can easily stumble into undue risk in pushing LLMs behind the corporate firewall. As with many emerging technologies, the risk depends on the path taken. If a customer wants to use Google BERT for basic text classification, the risk is relatively low. If a customer wants to stand up GPT-4 and feed it live corporate data for general question and answer (Q&A), then the risk is much, much higher.
There are still many unanswered questions surrounding how companies can customize LLMs to safely tackle domain-specific functions. This is particularly true when it comes to establishing a solid underlying architecture capable of operationalizing LLM development, deployment, and maintenance at scale. This is why we are currently seeing so many LLM-capable ML platform announcements. Technology providers recognize the importance of developing a solid foundation that minimizes such risks through operational best practices.
This foundational viewpoint is evident in the set of GAI announcements and promotions made by AWS in early April 2023, which span hardware, platform, services, and models. A short rundown of those announcements follows.
Amazon EC2 Trn1 Instances and Amazon EC2 Inf2 Instances
AWS is promoting its rapidly evolving AI acceleration hardware portfolio with iterations that can better accommodate both training and inference requirements of LLMs.
Amazon CodeWhisperer
AWS announced the general availability of its code completion and suggestion assistant, CodeWhisperer, featuring a free individual tier.
Amazon Bedrock
Bedrock is a new platform designed to help companies select, build, and deploy FMs from both Amazon and a collection of GAI-specific partners.
It is clear from this release that AWS intends to stand toe to toe with the other global platform players in supporting enterprise-grade GAI through both internal and partner-led research and development. From that vantage point, AWS and its hyper-scale rivals have their eye on solving several nagging IT issues, specific to GAI in the enterprise:
- Rapidly moving market. The GAI market is moving so rapidly that it is almost a waste of time for developers to learn version-by-version API calls. They need a platform that can help future-proof those investments through techniques like project-by-project resource virtualization.
- Rapidly expanding market. Not only is the GAI market moving fast, but it is also exploding with tier-one entrants like NVIDIA NeMo, BigScience Bloom, and Google LaMDA. The same holds true at the low end, particularly after the leak of the model weights for Meta’s LLaMA sparked the sudden rise of smaller, mostly open-source rivals to GPT such as Alpaca, GPT-J 6B, and GPT4All. What IT practitioners need is a central, governed repository that helps them locate and make use of the most appropriate tool (model) for the job at hand.
- Sizable infrastructure costs. As many early adopters of OpenAI’s API have discovered, hosted LLM costs can get out of hand rather quickly unless careful attention is paid to the entire software and hardware value chain: beginning with prompt engineering and ending with inferencing hardware utilization.
- Data exposure and IP leakage. It is incredibly difficult to build a domain-specific LLM without risking the exposure of corporate data. Beyond the usual ML issues of privacy, bias, etc., there is a new raft of dangers emerging with LLMs, revolving around instruction-following hacks and prompt injection weaknesses, that must be addressed on an ongoing basis.
- Transparency and accountability. Enterprises already struggle with deep learning (DL) transparency, explainability, and auditability issues. With LLMs, these issues have taken on an entirely new and somewhat terrifying dimension, given that the researchers building these models more often than not cannot explain the “why” behind model capabilities and output. As with other GAI issues, the models themselves are proving to be a solid ally, at least in helping users understand where they are getting their information or when they are hallucinating. However, much more is needed in the way of model monitoring and oversight before companies can truly consider LLMs an auditable and compliant technology.
Thankfully, we are seeing a sizable push from technology providers to address these concerns, led by ML platforms that are at least capable of turning LLMs into enterprise-grade, consumable APIs. Hugging Face, for example, has been chasing this cause for some time and with great success; many of the LLMs mentioned in this article are available for consumption via Hugging Face. And recently, the company introduced HuggingGPT, a service capable of connecting numerous AI models to solve complicated tasks ‒ cue a new round of AGI speculation.
Digging down to the bedrock
With this collection of announcements, AWS has set itself on the same path toward operationalizing GAI technologies in the enterprise, particularly in its approach with Amazon Bedrock. Available as a limited preview, this solution promises to equip companies with a single API management platform from which they can access a broad swath of FMs, beginning at launch with Amazon’s own Titan FMs. These include Titan Text, a generative AI model for text summarization, Q&A, information extraction, content generation, and classification, and Titan Embeddings, which converts text into numerical vector representations that capture its semantic meaning.
Omdia expects the concept of Titan Embeddings to play a crucial role in how the GAI market evolves, driving new technologies such as vector databases and, when combined with encryption, enabling more secure and private ways of incorporating contextual meaning from corporate data within LLMs. Titan Embeddings will change the way businesses think about basic functions like sentiment analysis, named entity extraction, and especially natural-language Q&A.
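To make the embeddings concept concrete, the minimal sketch below shows the pattern behind natural-language Q&A over a document corpus: text is converted into dense vectors, and cosine similarity identifies the passage whose meaning sits closest to the question. The embed() stub is a hypothetical placeholder for whatever embeddings endpoint (Titan Embeddings or otherwise) a customer ultimately calls; it is not an actual AWS API.

```python
# Minimal sketch of embedding-based semantic search over a small corpus.
# embed() is a hypothetical stand-in for an embeddings model such as
# Amazon Titan Embeddings; replace it with a real API call.
import numpy as np


def embed(text: str) -> np.ndarray:
    """Placeholder: return a dense vector capturing the text's meaning."""
    raise NotImplementedError("Call your embeddings endpoint here")


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_relevant_passage(question: str, passages: list[str]) -> str:
    """Return the passage whose embedding is closest to the question's."""
    q_vec = embed(question)
    scored = [(cosine_similarity(q_vec, embed(p)), p) for p in passages]
    return max(scored, key=lambda pair: pair[0])[1]
```

In practice, the corpus embeddings would be precomputed and stored in a vector database, which is exactly the adjacent technology market Omdia expects embeddings to accelerate.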
Importantly, AWS has built Amazon Bedrock to look outside as well as within, as the platform will provide the same kind of support for GAI FMs built by several external model creators:
- AI21 Labs. The Jurassic-2 family of LLMs, which target text generation across numerous languages.
- Anthropic. Claude, an LLM focused on conversations, Q&A, and workflow automation.
- Stability AI. Stable Diffusion, a text-to-image model for the generation of images, art, logos, and designs.
There is plenty of room to grow here, and Omdia hopes to see similar partnerships emerge with early and important players such as Cohere, Meta, EleutherAI, Hugging Face, NVIDIA, and BigScience. Support for rivals DeepMind (Google) and OpenAI (Microsoft) will eventually become a necessary extension as well.
In particular, Omdia would like to see a company like AWS tackle the burgeoning technology market for smaller, semi-open source but not yet commercial models built on Meta’s LLaMA LLM technology. Early models of note here include Alpaca, GPT4All, Koala, and Vicuna. There are even some commercial products emerging from this early research, most notably Dolly from Databricks, which is arguably the first fully open-source LLM (model training code, weights, and training data).
This active ecosystem has greatly accelerated the creation not just of LLMs but also tools that both support and extend the use of LLMs in the enterprise, such as:
- SkyPilot. A framework that makes it easy for developers to run ML workloads on any cloud via a single pane of glass.
- LangChain. A framework that enables developers to chain together LLMs, APIs, and data in new and inventive ways, as the brief sketch following this list illustrates.
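To illustrate the kind of composition LangChain enables, the sketch below injects a retrieved piece of corporate data into a prompt template and hands it to an LLM. It assumes LangChain’s early (2023-era) PromptTemplate/LLMChain interface and uses the OpenAI wrapper purely as an illustrative backend; exact import paths and model integrations change quickly and should be verified against the current LangChain documentation.

```python
# Illustrative LangChain composition: ground an LLM answer in retrieved data.
# Assumes an early (2023-era) LangChain API; import paths may differ in
# later releases. The OpenAI backend shown here is interchangeable.
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI  # requires OPENAI_API_KEY in the environment

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using only the context below.\n"
        "Context: {context}\n"
        "Question: {question}"
    ),
)

chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

retrieved_doc = "Q1 revenue grew 12% year over year, driven by EMEA demand."
print(chain.run(context=retrieved_doc, question="What drove Q1 growth?"))
```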
The technology changes, but the song remains the same
Not yet generally available but open for preview testing, Amazon Bedrock will undoubtedly expand to encompass numerous third-party technologies. Looking just at the platform itself, the offering already appears mature. For example, out of the gate, Amazon Bedrock users can take full advantage of the many economies of scale native to AWS, including the training and inference acceleration hardware mentioned earlier as well as full serverless functionality for all processes running on Amazon Bedrock itself.
More important than speed is the idea of operationalizing LLMs at scale. That is what Amazon Bedrock sets out to do from the get-go: helping companies select the best model from among these disparate sources and access that model via an API. This enables enterprise practitioners to easily incorporate LLMs into new and existing ML pipelines built in and around AWS’s sizable Amazon SageMaker family of ML tools.
Here is a scenario outlining the way Amazon Bedrock will work. A customer wants to build a text translation service from English into Spanish. The business analyst, working together with a data scientist, may browse Amazon Bedrock’s library of models, looking for a match. Once found (the Jurassic-2 model from AI21 Labs, for instance), the practitioners can then instantiate that model, add in their own corporate data (in accordance with corporate privacy and security standards), test the model for accuracy, bias, inclusiveness, etc., customize model functionality, and then roll that model out, all using the same set of Amazon SageMaker technologies.
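For teams sketching out that workflow, the request pattern will look something like the example below: a single runtime client, a model identifier, and a JSON body carrying the prompt. Because Amazon Bedrock remains in limited preview, the client name, model ID, and request/response field names shown here (which follow AI21’s Jurassic-2 conventions) are assumptions and should be checked against each provider’s Bedrock documentation before use.

```python
# Hedged sketch of calling a Bedrock-hosted model through one common API.
# Client name, model ID, and JSON field names are illustrative assumptions;
# verify the provider-specific request schema before relying on them.
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

request_body = {
    "prompt": "Translate into Spanish: 'Our store opens at nine.'",
    "maxTokens": 100,     # assumed AI21-style parameter name
    "temperature": 0.2,
}

response = bedrock.invoke_model(
    modelId="ai21.j2-mid-v1",          # illustrative Jurassic-2 model ID
    contentType="application/json",
    accept="application/json",
    body=json.dumps(request_body),
)

payload = json.loads(response["body"].read())
print(payload)  # the completion's location in the payload varies by provider
```

The appeal of this pattern is that swapping in a different partner model is, in principle, a change to the model identifier and request body rather than a whole new SDK integration.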
Is Amazon Bedrock doing LLMOps on its own? Not entirely. It serves as an important component of a broader LLMOps play ‒ a component that builds on the company’s rich history of MLOps practices and technologies. Put another way, just as with Amazon SageMaker, the introduction of Amazon Bedrock will help to abstract away the complexities of managing the underlying infrastructure necessary to build, run, and govern complex implementations that span disparate vendor technologies.
That is ultimately AWS’s biggest value proposition: to open up and simplify the task of building AI-powered solutions in the enterprise at any scale, whether those solutions are simple predictive models, advanced GAI solutions, or both working together. That is how you future-proof investments in a fast-moving market.
Appendix
Further reading
Generative AI: Tech Provider Viewpoints (March 2023)
Generative AI: Market Landscape 2023 (March 2023)
“And just like that, generative AI reaches the enterprise marketplace” (March 2023)
AWS. Announcing New Tools for Building with Generative AI on AWS (April 13, 2023) https://aws.amazon.com/blogs/machine-learning/announcing-new-tools-for-building-with-generative-ai-on-aws/
Author
Bradley Shimmin, Chief Analyst, AI Platforms, Analytics, and Data Management