May 9, 2023 | By David D. Cox | 5 min read

We stand on the frontier of an AI revolution. Over the past decade, deep learning arose from a seismic collision of data availability and sheer compute power, enabling a host of impressive AI capabilities. But we’ve faced a paradoxical challenge: automation is labor intensive. It sounds like a joke, but it’s not, as anyone who has tried to solve business problems with AI can attest.

Traditional AI tools, while powerful, can be expensive, time-consuming, and difficult to use. Data must be laboriously collected, curated, and labeled with task-specific annotations to train AI models. Building a model requires specialized, hard-to-find skills — and each new task requires repeating the process. As a result, businesses have focused mainly on automating tasks with abundant data and high business value, leaving everything else on the table. But this is starting to change. 

The emergence of transformers and self-supervised learning methods has allowed us to tap into vast quantities of unlabeled data, paving the way for large pre-trained models, sometimes called “foundation models.” These large models have lowered the cost and labor involved in automation.  

Foundation models provide a powerful and versatile foundation for a variety of AI applications. We can use foundation models to quickly perform tasks with limited annotated data and minimal effort; in some cases, we need only to describe the task at hand to coax the model into solving it.  
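As a concrete illustration of "describing the task," here is a minimal sketch using a small, publicly available instruction-tuned model as a stand-in; the model and library are illustrative choices, not watsonx.ai components:

```python
# Illustrative only: asking an instruction-tuned open-source model to perform
# a task purely from a natural-language description of that task.
from transformers import pipeline

# google/flan-t5-small is a small public model used here as a stand-in for
# any instruction-following foundation model.
generator = pipeline("text2text-generation", model="google/flan-t5-small")

prompt = (
    "Classify the sentiment of this customer review as positive or negative: "
    "'The replacement part arrived quickly and fit perfectly.'"
)
print(generator(prompt)[0]["generated_text"])  # e.g. "positive"
```

No labeled training set, no task-specific model: the instruction itself does the work.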

But these powerful technologies also introduce new risks and challenges for enterprises. Many of today’s models are trained on datasets of unknown quality and provenance, leading to offensive, biased, or factually incorrect responses. The largest models are expensive, energy-intensive to train and run, and complex to deploy. 

We at IBM have been developing an approach that addresses the core challenges of using foundation models in the enterprise. Today, we announced watsonx.ai, IBM’s gateway to the latest AI tools and technologies on the market. In a testament to how fast the field is moving, some of those tools are just weeks old, and we are adding new ones as I write.

What’s included in watsonx.ai — part of IBM’s larger watsonx offerings announced this week — is varied, and will continue to evolve, but our overarching promise is the same: to provide safe, enterprise-ready automation products. 

It’s part of our ongoing work at IBM to accelerate our customers’ journey to derive value from this new paradigm in AI. Here, I’ll describe our work to build a suite of enterprise-grade, IBM-trained foundation models, including our approach to data and model architectures. I’ll also outline our new platform and tooling, which enable enterprises to build and deploy foundation model-based solutions using a wide catalog of open-source models in addition to our own.

Data: the foundation of your foundation model  

Data quality matters. An AI model trained on biased or toxic data will naturally tend to produce biased or toxic outputs. This problem is compounded in the era of foundation models, where the data used to train models typically comes from many sources and is so abundant that no human being could reasonably comb through it all. 

Since data is the fuel that drives foundation models, we at IBM have focused on meticulously curating everything that goes into our models. We have developed AI tools to aggressively filter our data for hate and profanity, licensing restrictions, and bias. When objectionable data is identified, we remove it, retrain the model, and repeat. 
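In spirit, a single filtering pass looks something like the sketch below. The checks shown are simple keyword placeholders; our production tooling is model-based and far more sophisticated:

```python
# A simplified sketch of a data-curation filtering pass. The blocklist terms
# and license markers are hypothetical placeholders, not IBM's actual filters.
import re

BLOCKLIST = re.compile(r"\b(badword1|badword2)\b", re.IGNORECASE)  # placeholder terms
LICENSE_MARKERS = ("all rights reserved", "confidential")          # placeholder markers

def passes_filters(document: str) -> bool:
    """Return True if a document survives hate/profanity and licensing checks."""
    text = document.lower()
    if BLOCKLIST.search(text):
        return False          # hate-and-profanity filter
    if any(marker in text for marker in LICENSE_MARKERS):
        return False          # licensing-restriction filter
    return True

corpus = ["An ordinary news paragraph.", "Internal memo - CONFIDENTIAL."]
cleaned = [doc for doc in corpus if passes_filters(doc)]
print(cleaned)  # only the first document survives
```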

Data curation is a task that’s never truly finished. We continue to develop and refine new methods to improve data quality and controls, to meet an evolving set of legal and regulatory requirements. We have built an end-to-end framework to track the raw data that’s been cleaned, the methods that were used, and the models that each datapoint has touched.  
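A minimal sketch of the kind of lineage record such a framework might maintain; the field and model names here are hypothetical, not our internal schema:

```python
# Hypothetical data-lineage record: track where a datapoint came from, what
# cleaning it went through, and which models were trained on it.
from dataclasses import dataclass, field

@dataclass
class DatapointLineage:
    raw_source: str                                       # origin of the raw data
    cleaning_steps: list = field(default_factory=list)    # filters applied, in order
    models_touched: list = field(default_factory=list)    # models trained on this datapoint

record = DatapointLineage(raw_source="web-crawl/shard-0042")  # hypothetical source
record.cleaning_steps += ["profanity-filter-v3", "license-filter-v1"]
record.models_touched.append("example-model-v1")              # hypothetical model name
print(record)
```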

We continue to gather high-quality data to help tackle some of the most pressing business challenges across a range of domains like finance, law, cybersecurity, and sustainability. We are currently targeting more than 1 terabyte of curated text for training our foundation models, while adding curated software code, satellite data, and IT network event data and logs.

IBM Research is also developing techniques to infuse trust throughout the foundation model lifecycle, to mitigate bias and improve model safety. Our work in this area includes FairIJ, which identifies biased data points in data used to tune a model, so that they can be edited out. Other methods, like fairness reprogramming, allow us to mitigate biases in a model even after it has been trained. 
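The details of FairIJ are beyond this post, but the general pattern, scoring each tuning datapoint's estimated contribution to a bias metric and dropping the worst offenders before re-tuning, can be sketched as follows; the scoring function is a placeholder, whereas FairIJ itself uses influence-based estimates:

```python
# Generic sketch of bias-point pruning. The scoring function is a stand-in;
# FairIJ derives these scores from influence estimates on the tuned model.
def bias_influence(datapoint) -> float:
    """Placeholder: estimate how much this point increases a bias metric."""
    return datapoint.get("bias_score", 0.0)

def prune_biased_points(dataset, k):
    """Remove the k datapoints with the highest estimated bias influence."""
    ranked = sorted(dataset, key=bias_influence, reverse=True)
    return ranked[k:]

data = [{"text": "a", "bias_score": 0.9}, {"text": "b", "bias_score": 0.1}]
print(prune_biased_points(data, k=1))  # keeps only the low-influence point
```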

Efficient foundation models focused on enterprise value 

IBM’s new watsonx.ai studio offers a suite of foundation models aimed at delivering enterprise value. They’ve been incorporated into a range of IBM products that will be made available to IBM customers in the coming months. 

Recognizing that one size doesn’t fit all, we’re building a family of language and code foundation models of different sizes and architectures. Each model family has a geology-themed code name: Granite, Sandstone, Obsidian, and Slate. Each family brings together cutting-edge innovations from IBM Research and the open research community, and each model can be customized for a range of enterprise tasks.

Our Granite models are based on a decoder-only, GPT-like architecture for generative tasks. Sandstone models use an encoder-decoder architecture and are well suited for fine-tuning on specific tasks, interchangeable with Google’s popular T5 models. Obsidian models utilize a new modular architecture developed by IBM Research, providing high inference efficiency and strong performance across a variety of tasks. Slate refers to a family of encoder-only (RoBERTa-based) models, which, while not generative, are fast and effective for many enterprise NLP tasks. All watsonx.ai models are trained on IBM’s curated, enterprise-focused data lake, on our custom-designed, cloud-native AI supercomputer, Vela.
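For readers who want to map these three architecture styles onto familiar open-source analogues, here is a minimal sketch using the Hugging Face transformers library; the checkpoints below are public stand-ins, not IBM's models:

```python
# The three architecture styles, illustrated with public checkpoints as
# stand-ins (gpt2, t5-small, and roberta-base are open analogues, not the
# Granite, Sandstone, or Slate models themselves).
from transformers import (
    AutoModelForCausalLM,   # decoder-only: Granite-style generative models
    AutoModelForSeq2SeqLM,  # encoder-decoder: Sandstone/T5-style models
    AutoModel,              # encoder-only: Slate/RoBERTa-style models
)

decoder_only = AutoModelForCausalLM.from_pretrained("gpt2")
encoder_decoder = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
encoder_only = AutoModel.from_pretrained("roberta-base")
```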

Efficiency and sustainability are core design principles for watsonx.ai. At IBM Research, we’ve invented new technologies for efficient model training, including our “LiGO” algorithm, which recycles small models and “grows” them into larger ones. This method can save from 40% to 70% of the time, cost, and carbon output required to train a model. To improve inference speeds, we’re leveraging our deep expertise in quantization: shrinking models from 32-bit floating-point arithmetic to much smaller integer formats. Reducing AI model precision brings huge efficiency benefits without sacrificing accuracy. We hope to soon run these compressed models on our AI-optimized chip, the IBM AIU.
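As a rough sketch of the quantization idea, here is PyTorch's built-in post-training dynamic quantization applied to a toy model; this illustrates the general technique, not IBM's production pipeline:

```python
# Post-training dynamic quantization: the weights of Linear layers are
# converted from 32-bit floats to 8-bit integers, shrinking the model and
# speeding up CPU inference.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(768, 3072),
    torch.nn.ReLU(),
    torch.nn.Linear(3072, 768),
)

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)  # Linear layers are replaced by dynamically quantized versions
```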

Hybrid cloud tools for foundation models 

The final piece of the foundation model puzzle is creating an easy-to-use software platform for tuning and deploying models. IBM’s hybrid, cloud-native inference stack, built on Red Hat OpenShift, has been optimized for training and serving foundation models. Enterprises can leverage OpenShift’s flexibility to run models anywhere, including on-premises.

We’ve created a suite of tools in watsonx.ai that provide customers with an intuitive user interface and developer-friendly libraries for building foundation model-based solutions. Our Prompt Lab enables users to rapidly perform AI tasks with just a few labeled examples. The Tuning Studio enables rapid and robust model customization using your own data, based on state-of-the-art efficient fine-tuning techniques developed by IBM Research.
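To give a flavor of the kind of parameter-efficient tuning such a studio builds on, here is a sketch using the open-source peft library; this is an illustrative analogue, not the watsonx.ai API or IBM's exact technique:

```python
# Prompt tuning with the open-source peft library: a handful of trainable
# "virtual tokens" are prepended to the model's input while the base model's
# weights stay frozen, so customization touches only a tiny parameter set.
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # public stand-in base model
config = PromptTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=8)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the prompt embeddings are trainable
```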

In addition to IBM’s own models, watsonx.ai provides seamless access to a broad catalog of open-source models for enterprises to experiment with and quickly iterate on. In a new partnership with Hugging Face, IBM will offer thousands of open-source Hugging Face foundation models, datasets, and libraries in watsonx.ai. Hugging Face, in turn, will offer all of IBM’s proprietary and open-access models and tools on watsonx.ai.  
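As a rough illustration of how lightweight this experimentation can be, the open-source transformers library (shown here outside the studio, not as the watsonx.ai interface) loads and runs a model in a few lines:

```python
# Loading and running an open-source model programmatically; inside
# watsonx.ai the equivalent is a point-and-click selection.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default open model
print(classifier("watsonx.ai makes model experimentation straightforward."))
```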

To try out a new model, simply select it from a drop-down menu. You can learn more about the studio here.

Looking to the future 

Foundation models are changing the landscape of AI, and progress in recent years has only been accelerating. We at IBM are excited to help chart the frontiers of this rapidly evolving field and translate innovation into real enterprise value. 

Learn more about watsonx.ai
