Omdia view

Summary

IBM has released a substantial update to its Granite family of large language models (LLMs) that targets several significant market concerns surrounding trust, transparency, and cost efficiency. In this report, Omdia examines IBM’s newest language generation and guardrail models, which were released at its annual TechXchange event in Las Vegas, Nevada, on October 21–24, 2024, to see how well they align with current enterprise demands.

Why this matters

The artificial intelligence (AI) hype cycle continues to churn in late 2024, with many major AI model makers hinting at or outright promoting the looming but always over-the-horizon arrival of advanced generative AI (GenAI) capabilities like real-time conversations, multi-modal generation, automated semantic search capabilities, and even advanced reasoning. While this kind of noise may continue to move the needle with executive boards, investors, and market speculators, enterprises looking to actually build GenAI solutions want something more—or less, as it were.

As Omdia learned earlier this year in surveying more than 350 early enterprise GenAI adopters, companies are building with and on top of GenAI not to achieve some sort of breakthrough in artificial general intelligence (AGI). Rather, businesses of all sizes, across all regions and markets, simply want to do business better than before. That means capturing efficiency gains, enhancing customer experience, and, of course, saving money (see Figure 1).

Figure 1: GenAI priorities among enterprise practitioners. Source: Omdia

Achieving such goals necessitates focusing on some of GenAI’s more pragmatic facets: accuracy, repeatability, performance, security, governance, and a host of other responsible AI requirements. With more than a year of experience as a maker of GenAI LLMs, IBM appears laser-focused on tackling responsible AI, a philosophy on display in the company’s new set of Granite 3.0 model family announcements made at its annual TechXchange conference.

The past and present of IBM Granite

To illustrate, consider the market’s long-standing infatuation with LLMs sporting ever-larger model parameter counts. When IBM introduced its family of Granite LLMs in 2023, the company didn’t try to match frontier model makers OpenAI and Anthropic in delivering models that could only be hosted on the largest hyperscale cloud platforms. Instead, IBM rolled out Granite 13b.instruct and Granite 13b.chat, two models each capable of running on a single V100 32GB GPU.

Similarly, IBM focused on pre-training these models using seven terabytes of curated data, a portion of which was tailored to specialized business-domain tasks, such as classification, summarization, and question answering. At the time, the idea was to build models well-suited to actual business use cases by incorporating semantic meaning from business domains, including academic, legal, finance, and software development. Fast forward to October 2024 and the release of the Granite 3.0 family of models, and it is easy to see a company staying true to its original intent and growing these core capabilities to address both opportunities and challenges facing today’s enterprise practitioner.

To begin, IBM has introduced several new models, adding to its portfolio that currently spans chat, instruction, code, geospatial, time-series, and guardrail models. With the 3.0 release, IBM Granite now includes the following groupings:

  • Several general-purpose models built to tackle basics like classification, entity extraction, tool use, and retrieval augmented generation (RAG):
    • Granite 3.0 8B Instruct
    • Granite 3.0 2B Instruct
    • Granite 3.0 8B Base
    • Granite 3.0 2B Base
  • Responsible AI-oriented safety and guardrail models to check user prompts and model outputs for any potential harm:
    • Granite Guardian 3.0 8B
    • Granite Guardian 3.0 2B
  • A new mixture-of-experts (MoE) set of lightweight models capable of supporting low-latency workflows on CPUs:
    • Granite 3.0 3B A800M Instruct
    • Granite 3.0 1B A400M Instruct
    • Granite 3.0 3B A800M Base
    • Granite 3.0 1B A400M Base

As with the company’s first foray into LLMs, these new models are what Omdia would term “right-sized” regarding parameter count. In delivering 1B through 8B sizes, IBM has actually downsized its models, keeping pace with an important trend Omdia has noted among model makers where quality pre-training data and well-crafted model training and alignment techniques take precedence over parameters.

As seen with smaller models like Microsoft’s Phi family, extensive training on high-quality data can go a long way, allowing smaller models to take on tasks typically associated with much larger models. But there’s a lot more to these smaller models that makes them well-suited to enterprise use cases. This new family of Granite 3.0 models focuses on several critical model capabilities:

Transparency in model training. Though IBM doesn’t make its pre-training data available, it uniquely provides full transparency into its two-phase training method and data sources. These models were trained on more than 12 trillion tokens using datasets spanning 12 different human languages and 116 different programming languages.

Open source release. As before, all Granite models are available under the highly permissive and well-regarded Apache 2.0 license. This allows IBM to stand apart from “pseudo” community-licensed models that only masquerade as open source. IBM’s commitment will attract a solid ecosystem of developers and allow for scrutiny and community contributions, which will go a long way in delivering safe, trusted GenAI outcomes.

Focus on enterprise tasks. Granite models are built to excel on tasks critical to businesses, ensuring they are relevant for real-world applications and can be used responsibly and professionally. Perhaps the best example is IBM’s time-series models that can work with structured data.

Safety and guardrails. IBM was very early to tackle responsible AI, releasing bias detection toolkits such as AI Fairness 360 (AIF360). Continuing these efforts and matching stride with competitors like Meta with Llama Guard, IBM has developed Granite Guardian 3.0, a set of companion models designed to enhance the safety of not just Granite language models but all models capable of running on the company’s watsonx.ai platform.

Granite Guardian is trained to detect various risks, including jailbreaking attempts, bias, violence, profanity, sexual content, and unethical behavior, all of which aim to prevent models from generating harmful outputs.

Another key facet of Granite Guardian is its ability to identify and flag instances where an LLM might generate incorrect or fabricated information, particularly in RAG workflows, going so far as to tackle groundedness, context relevance, and answer relevance.
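IBM has not published the internal mechanics of Granite Guardian, which uses a trained LLM to make these judgments. Purely to illustrate the kind of signal a groundedness check produces in a RAG workflow, the sketch below uses a deliberately simple token-overlap heuristic; it is a toy stand-in, not Granite Guardian’s method, and the scoring function and example strings are invented for this illustration:

```python
# Toy groundedness heuristic: what fraction of an answer's content words
# appear in the retrieved context? A real guardrail model (like Granite
# Guardian) makes this judgment with a trained LLM; this overlap score is
# only an illustration of the signal such a check emits.

def groundedness_score(answer: str, context: str) -> float:
    stopwords = {"the", "a", "an", "is", "are", "of", "in", "to", "and"}
    answer_terms = {w for w in answer.lower().split() if w not in stopwords}
    context_terms = set(context.lower().split())
    if not answer_terms:
        return 1.0  # nothing substantive to verify
    return len(answer_terms & context_terms) / len(answer_terms)

context = "granite 3.0 models were released at techxchange in october 2024"
grounded = "granite 3.0 models were released in october 2024"
fabricated = "granite 3.0 models cost twelve dollars per month"

print(groundedness_score(grounded, context))    # high overlap with context
print(groundedness_score(fabricated, context))  # low overlap: likely fabricated
```

In a production RAG pipeline, an answer scoring below some threshold on groundedness (or on context or answer relevance) would be flagged or regenerated rather than returned to the user.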

Accessibility and developer enablement. Lastly, IBM provides various resources to facilitate responsible development and use of the Granite models. These resources include model playgrounds, documentation, “how to” recipes, and prompt engineering guides. The idea is to make it easier for developers to understand and use the models effectively and responsibly.

This is key for IBM, given that the company does not enjoy the same degree of exposure within the broader AI and app development community as its rivals Microsoft, Amazon Web Services (AWS), and Google. New and forthcoming developer-friendly tools like agentic frameworks, chat-based workflow automation via IBM watsonx Orchestrate, low-/no-code tooling, and extensive app/data integrations on watsonx will help the vendor in this regard.

Analysis and outlook

An interesting facet of this release is that IBM will make a subset of the Granite 3.0 family of models available on competitor platforms, namely NVIDIA’s NIM and Google Vertex AI. Granite has already been integrated within several ecosystems, such as AWS, Docker, Salesforce, Hugging Face, GitHub, and SAP.

These integrations will certainly help IBM remain relevant within a very noisy model marketplace by enabling practitioners to experiment with Granite models alongside competitive “right-sized” offerings from Meta (Llama), AWS (Titan), and Google (Gemma). Another helpful approach IBM is taking with this release is to make its instruction-tuned models available on Ollama and Replicate.ai.

Support for Ollama will allow IBM to reach millions of enthusiasts looking to experiment with LLMs on their local Windows, Linux, and macOS machines. Similarly, working with LLM inference vendors like Replicate.ai will help IBM get its models into the hands of enterprise practitioners looking not only to run but also to fine-tune LLMs to support very specific use cases and domains of knowledge.

IBM certainly can and should do more to elevate the profile of its Granite family of models. Doing so will be necessary if IBM hopes to shift customer attention from Llama, Titan, Gemma, and many other rival models. Omdia recommends that IBM build upon the following opportunities to accomplish this goal.

First, IBM should trumpet its commitment to transparency in adopting the Apache 2.0 license. This sets the company apart from many rivals that only offer community licenses couched as open source. Such community licenses provide no protection from exposure to risk over the long term, ultimately undermining and devaluing the concept of free and open source software (FOSS). Conversely, IBM’s commitment to FOSS and its emphasis on transparent training data will foster trust and encourage community engagement, which could lead to further model innovation and wider adoption.

Second, IBM should draw attention to its singular focus on enterprise requirements. Its tailoring of models for specific business tasks, including specialized models for time series and geographic data, demonstrates a deep understanding of enterprise needs. Further, focusing on practical applications could give IBM an edge in the business AI market, where the company has excelled over several decades.

Third, IBM should expand its suite of responsible AI tools. The introduction of Granite Guardian for safety and bias detection shows IBM’s proactive stance on AI ethics and governance. It also aligns well with growing regulatory and public concerns about AI safety. But there’s more for the company to do, especially in addressing the lurking threat of AI security and privacy.

Lastly, IBM should double down on expanding its developer ecosystem through tools and support programs. In particular, the company should work more closely with its subsidiary, Red Hat, to encourage local LLM exploration and development using Red Hat Enterprise Linux AI, InstructLab, and Podman AI Lab. Elevating these tools will help the company attract more developers and build a strong partner ecosystem around Granite models. This will also help attract enterprise buyers to its broader suite of tools, such as IBM Cloud Pak for Data and the watsonx suite of AI tools.

Appendix

Further reading

“Looking past the AI branding hype and toward increasingly unified GenAI platforms” (August 2024)

Generative AI Enterprise Survey: Early Adopters – 2024 (May 2024)

Author

Bradley Shimmin, Chief Analyst, AI Platforms

[email protected]