Enterprise Gen-AI Series: Why Vector Databases are Hot!

By Paul Shearer, Solution Architecture, VP

Vector databases are hot! Outside of large language models (LLMs) themselves, vector databases are quickly becoming one of the most exciting areas for private equity investment. In this article, we will dig into why they are the key technology making LLMs useful to the enterprise, and why the C-suite needs a high-level understanding of them.

Traditional OLTP Databases vs. Vector Databases

In the world of ERP, we get databases. Even functional end-users have a good working concept of table structure. For instance, think about a sales order in your ERP system. One table contains the sales order header record, and another table has the sales order detail. These tables are tied together by a key value, most likely a sales order number. All the data, products, SKUs, and quantities are neatly stored within specific columns in the tables. Even our questions (queries) to the database are well structured. “Give me all the rows where sales order promised date is greater than 9/01/2024.”
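To make the structured case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are invented for illustration and are not taken from any particular ERP system.

```python
import sqlite3

# In-memory database with a hypothetical header/detail pair of tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_order_header (
        order_no INTEGER PRIMARY KEY,
        customer TEXT,
        promised_date TEXT          -- ISO dates sort correctly as text
    );
    CREATE TABLE sales_order_detail (
        order_no INTEGER,
        sku TEXT,
        quantity INTEGER
    );
    INSERT INTO sales_order_header VALUES (1001, 'Acme', '2024-08-15');
    INSERT INTO sales_order_header VALUES (1002, 'Globex', '2024-09-15');
    INSERT INTO sales_order_detail VALUES (1001, 'SKU-A', 10);
    INSERT INTO sales_order_detail VALUES (1002, 'SKU-A', 5);
    INSERT INTO sales_order_detail VALUES (1002, 'SKU-B', 2);
""")

# "Give me all the rows where sales order promised date is greater than 9/01/2024."
rows = conn.execute("""
    SELECT h.order_no, d.sku, d.quantity
    FROM sales_order_header AS h
    JOIN sales_order_detail AS d ON d.order_no = h.order_no
    WHERE h.promised_date > '2024-09-01'
    ORDER BY h.order_no, d.sku
""").fetchall()

print(rows)
```

The header and detail tables are joined on the order number, and the filter is an exact comparison on a known column. That precision is the strength of the relational model, and also its limit once the data is free text.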

What our beloved OLTP databases don’t do well (at least for the moment) is handle large volumes of unstructured data such as text, images, or audio. Think about a query where I want to say, “Give me the 5 most similar items to ‘this string of text.’” A vector database matches on semantic meaning, not just on keywords. To accomplish this nifty trick, vector databases employ a fundamentally different architecture.

Note: Oracle is adding vector search to Oracle Database 23c, although specifics around editions and pricing are still vague. Vector database functionality is not currently available in SQL Server.

The Concept of Conceptual Similarity Search

In a vector database, the focus shifts from those well-structured queries to conceptual similarity searches. This involves identifying the database items that are most like the input query, not in literal terms but in meaning. For example, take the query “The Red Delicious apple was a crime against humanity.” Are we talking about Apple the company, the Apple computer, or the fruit? A traditional keyword search might return rows for all of the above. A vector database recognizes that “Red Delicious” was a type of nasty apple that was in every supermarket in the 80s and 90s.
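A similarity search can be sketched in a few lines of Python. This is an illustration only: the three-dimensional vectors below are hand-picked toy values (real systems get them from a trained encoding model), and cosine similarity is assumed as the closeness measure, which is one common choice among several.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-picked toy vectors. The axes loosely mean
# (fruit, technology company, classic computer). Not real model output.
items = {
    "Granny Smith apples are tart": [0.9, 0.1, 0.0],
    "Apple reports quarterly earnings": [0.1, 0.9, 0.2],
    "Restoring a vintage Apple II": [0.0, 0.4, 0.9],
    "Honeycrisp is the best pie apple": [0.8, 0.0, 0.1],
}

def top_k(query_vector, k):
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text)
              for text, vec in items.items()]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

# Pretend this is the vector for the Red Delicious sentence.
query = [0.95, 0.05, 0.0]
results = top_k(query, 2)
print(results)
```

Both results are about the fruit, even though every stored item contains the word "apple". A production vector database does the same ranking, but uses approximate indexes so it does not have to compare the query against every stored vector.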

AI-Based Encoding Models: The Backbone of Vector Databases

The power of vector databases lies in AI-based encoding models. These models are typically pre-trained on vast amounts of text data and can encode semantic meaning into a high-dimensional vector space. Each word or phrase is transformed into a vector based on its semantic location in an n-dimensional space, commonly with hundreds and in some cases thousands of dimensions. (If you’re a sci-fi nerd, think of the multiverse from Marvel, the recent movie Everything Everywhere All at Once, or the Mirror Universe from Star Trek.)

Example: The Multiple Meanings of "Apple"

Let’s go back to our "apple" example. If the training data is evenly split between references to the fruit, the technology company, and the classic computer, the resulting vector for "apple" would sit roughly in the middle of the three clusters representing each concept in the vector space. However, the context in which "apple" appears (e.g., words or phrases in the surrounding text) can shift this vector closer to one of those clusters, thus capturing the intended meaning more accurately.
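The pull of context can be illustrated with a deliberately simplified sketch. It blends the vector for "apple" with the average of its neighboring words' vectors. Real encoding models use far more sophisticated mechanisms than averaging, and every vector and function name here is a made-up assumption.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy axes: (fruit, technology company, classic computer).
cluster_centers = {
    "fruit": [1.0, 0.0, 0.0],
    "company": [0.0, 1.0, 0.0],
    "computer": [0.0, 0.0, 1.0],
}

# With evenly split training data, "apple" sits between the three clusters.
apple = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical vectors for surrounding words.
word_vectors = {
    "orchard": [1.0, 0.0, 0.0],
    "pie": [0.9, 0.05, 0.05],
    "iphone": [0.0, 0.9, 0.1],
    "shareholders": [0.0, 1.0, 0.0],
}

def contextual_vector(word_vec, context_words, weight=0.5):
    """Blend a word's vector with the average vector of its context words."""
    context = [word_vectors[w] for w in context_words]
    average = [sum(col) / len(context) for col in zip(*context)]
    return [(1 - weight) * w + weight * c for w, c in zip(word_vec, average)]

def nearest_cluster(vec):
    """Name of the cluster center most similar to the given vector."""
    return max(cluster_centers, key=lambda name: cosine(vec, cluster_centers[name]))

fruit_sense = nearest_cluster(contextual_vector(apple, ["orchard", "pie"]))
company_sense = nearest_cluster(contextual_vector(apple, ["iphone", "shareholders"]))
print(fruit_sense, company_sense)
```

The same starting vector lands in a different cluster depending on its neighbors, which is the behavior the paragraph above describes.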

Creating Embeddings: Storing Data in Vector Databases

In a vector database, the process of transforming text into vectors is known as creating embeddings. Each piece of text is converted into a series of numerical values that collectively locate a point in the vector space. This point, or vector, may be stored alongside metadata such as links to the original documents, creating a rich, searchable database layer.
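As a rough sketch of what gets stored, the snippet below pairs each vector with its original text and a link back to the source. The `toy_embed` function is a stand-in that simply counts a few vocabulary words; it is not a real encoding model, and the URLs are placeholders.

```python
from dataclasses import dataclass

# Tiny fixed vocabulary for the stand-in embedding (an assumption for illustration).
VOCABULARY = ["order", "invoice", "shipment", "warranty"]

def toy_embed(text):
    """Stand-in for a real encoder: one dimension per vocabulary word."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]

@dataclass
class Record:
    vector: list   # the embedding, i.e. the point in vector space
    text: str      # the original text
    source: str    # metadata: where the text came from

store = []

def add_document(text, source):
    """Embed the text and keep the vector alongside its metadata."""
    store.append(Record(toy_embed(text), text, source))

add_document("Warranty claims require the original invoice",
             "https://example.com/docs/warranty")
add_document("Each shipment references one sales order",
             "https://example.com/docs/shipping")

for record in store:
    print(record.vector, record.source)
```

Keeping the source link next to the vector is what lets a search result point back to the full document.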

Application in Large Language Models (LLMs)

Vector databases play a crucial role in enhancing the capabilities of large language models (LLMs). The user's prompt is converted into a vector, and a similarity search finds the closest matches in the vector space. These matches point to documents that are passed to the LLM along with the prompt, supplementing the model's response with additional, up-to-date information. This method helps ground the model's output in verified data, reducing the likelihood of generating incorrect or hallucinated content. This process is known as Retrieval-Augmented Generation (RAG).

How RAG Works

Retrieval-Augmented Generation (RAG) combines language models' broad knowledge with precise information retrieval for nuanced answers. For example, when a pharma researcher asks about CRISPR, RAG references a vector database of scientific work to enhance the LLM's basic understanding with current findings. It's like a student using a cheat sheet during a test, but the sheet is dynamically compiled from the latest studies to address the specific question asked. This fusion of LLMs with real-time data empowers RAG to deliver expert-level, updated responses to specialized queries.
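The retrieval and prompt-building steps can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_embed` uses simple word counts in place of a real encoding model, the two stored snippets are short sample sentences, and the final call to the LLM is left out because it depends on whichever model and client library you use.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in embedding: a sparse word-count vector (not a real encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

documents = [
    "Denovo is an Oracle Applications, Infrastructure, and Services partner.",
    "A vector database matches on semantic meaning, not just keywords.",
]

# The "vector database": each document stored next to its vector.
index = [(toy_embed(doc), doc) for doc in documents]

def retrieve(question, k=1):
    """Return the k documents most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

def build_prompt(question):
    """Combine retrieved context with the question before calling the LLM."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("Who is Denovo?")
print(prompt)
```

The model never needs retraining. It simply receives the retrieved text as part of the prompt, which is why swapping in fresher documents immediately changes the answers.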

Example:

Below is the answer from the Llama 3 8B model when asked who Denovo is:

A: “Denovo is a company that specializes in creating AI-driven solutions for industries like healthcare, life sciences, and financial services. They use cutting-edge technology to develop innovative products and services that help organizations streamline their operations and enhance decision-making processes.”

Actually, I kind of like that answer even though it's completely wrong.

Here is the same prompt but using RAG with the contents of the Denovo website embedded into the vector database.

A: “Denovo is an Oracle Applications, Infrastructure, and Services partner that helps customers design, implement, and maintain ERP-centric solutions. They specialize in providing industry-specific expertise to drive optimal business performance for organizations across various industries, such as manufacturing, distribution, consumer products, and more. Denovo offers tested methodologies, proprietary technology, and hands-on experience to help customers solve their most complex business challenges.”

Conclusion

Vector databases represent a significant advancement in managing and retrieving unstructured data. By leveraging AI-driven embeddings, these databases allow for nuanced, context-aware searches that traditional databases cannot support. As AI technology evolves, the use of vector databases is likely to become more prevalent, particularly in applications requiring a deep understanding of complex, varied data inputs.
