If you want to leverage the power of LLMs, you often need to give them context or data that was not in their original training set so they can give the right answer to your queries. By making your company data available at the time an LLM generates its response, you can improve its ability to give more useful, personalised answers.
But how can you do this safely and securely and provide your data in a way that an LLM can use it effectively?
The answer lies in translating your data into a format that can be understood by an LLM. You'll need to store the data accessibly and ensure your application consults this data source before generating answers.
Read on to learn how and why to do this.
First, we need to understand a little more about how LLMs work. LLMs, and the applications built on top of them, are able to interpret and relate data because of “embeddings”.
Embeddings are an important aspect of how LLMs “understand” the world.
Embedding algorithms capture the semantic similarity of data. In other words, they encode the meaning of text, images and other data, and how these relate to one another. They achieve this by creating numerical representations, or “vectors”, that can then be used to assess how close or far apart concepts are.
For example, animals and fruits are semantically distinct groupings and so 'cow' and 'pig' are grouped more closely than 'pig' and 'apple'.
This same process can capture the relationship and meaning between bits of more complex, unstructured data. For example, the content of podcasts or user reviews for products.
Embeddings represented in a vector space
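To make this concrete, here is a minimal sketch of comparing embeddings with cosine similarity. The three-dimensional vectors are made-up values for illustration; a real embedding model produces vectors with hundreds or thousands of dimensions.

```python
import math

# Toy embeddings (hypothetical values chosen for illustration only).
embeddings = {
    "cow":   [0.90, 0.80, 0.10],
    "pig":   [0.85, 0.75, 0.15],
    "apple": [0.10, 0.20, 0.90],
}

def cosine_similarity(a, b):
    """Similarity of two vectors: close to 1.0 means 'same meaning'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["cow"], embeddings["pig"]))    # close to 1
print(cosine_similarity(embeddings["pig"], embeddings["apple"]))  # much lower
```

Because ‘cow’ and ‘pig’ point in nearly the same direction in the vector space, their similarity score is far higher than the score for ‘pig’ and ‘apple’.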
This is where the idea of semantic search comes in. Data stored as embeddings can be compared for similarity in “meaning”, so users can search intuitively, in the same way we might ask a real person where to find something – with descriptive phrases rather than by keywords.
This is incredibly powerful because we tend to have an idea of what we’re looking for, but rarely do we know precisely which keywords appear in what we want.
For example, with semantic search we can find documents related to "healthy recipes" even if they don't explicitly contain the keyword "healthy".
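As a sketch of how that works: rank documents by how close their vectors are to the query's vector. The two-dimensional vectors below are made-up stand-ins; a real system would get them from an embedding model.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical document embeddings -- hand-written for the demo.
documents = {
    "Quick kale and quinoa salad":     [0.90, 0.20],
    "Low-sugar overnight oats":        [0.80, 0.30],
    "Triple-chocolate fudge brownies": [0.10, 0.90],
}

def semantic_search(query_vector, docs, top_k=2):
    """Return the top_k document titles most similar to the query vector."""
    ranked = sorted(docs.items(),
                    key=lambda item: cosine_similarity(query_vector, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# A made-up vector for the query "healthy recipes":
print(semantic_search([0.95, 0.25], documents))
```

Neither recipe title contains the word “healthy”, yet both rank above the brownies because their vectors sit near the query's vector.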
Semantic search is also why recommendation engines can be so effective: it lets them find items that are statistically similar to the ones you have already told them you like.
I've written elsewhere about how Spotify and others are using semantic search effectively.
Vector search has been big tech’s closely guarded secret for years. When you find Amazon suggesting that product you never knew you wanted, that isn’t magic; it’s vector search.
Okay, so now we understand how LLMs interpret and “understand” the world in order to predict the output users want from their queries.
But how do you feed private data in securely, so that LLM-powered applications can complete tasks for your end users?
Use an embedding algorithm to create embeddings for your private data and store them in a vector database. Then, when a user makes a query, your data sources are consulted and the relevant results are provided to the LLM for use in its reasoning when generating a response. These methods are shown in steps 1 and 2 in the diagram below.
Data storage and retrieval with embeddings and LLMs
This architecture is commonly referred to as “retrieval augmented generation” (RAG).
As outlined in the diagram below, users' queries are directed first to your retrieval system before the LLM so that your data is added into the context that the LLM can use to return an accurate answer.
A simplified representation of augmenting user queries with a retrieval system
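The retrieve-then-generate flow can be sketched in a few lines. The `embed` function below is a toy bag-of-words stand-in for a real embedding model, and the “vector database” is just a Python list; both are hypothetical placeholders, not a production design.

```python
import math

# Toy vocabulary-based embedding -- a stand-in for a real embedding model.
VOCAB = ["refund", "support", "open", "days", "processed"]

def embed(text):
    """Normalised bag-of-words vector over a tiny vocabulary (demo only)."""
    words = text.lower().split()
    vec = [sum(1 for w in words if term in w) for term in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Step 1: embed your company documents and store them in the "vector database".
knowledge_base = [
    "Refunds are processed within 5 working days.",
    "Our support line is open 9am to 5pm on weekdays.",
]
vector_db = [(doc, embed(doc)) for doc in knowledge_base]

def retrieve(query, top_k=1):
    """Step 2: find the stored documents most similar to the user's query."""
    q = embed(query)
    scored = sorted(vector_db,
                    key=lambda pair: sum(x * y for x, y in zip(q, pair[1])),
                    reverse=True)
    return [doc for doc, _ in scored[:top_k]]

def build_prompt(query):
    """Augment the user's query with retrieved context before it reaches the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```

In a real system the returned prompt would be sent to the LLM, which now has your company data in its context when it generates the answer.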
You can then write code to automatically update the database so that the LLM has near real-time insight into your company information such as documentation, stock availability or customer history.
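One way such an update job might look, as a sketch: re-embed only the documents that changed since the last sync. The `embed` and `upsert` callables are hypothetical stand-ins for whatever embedding model and vector database client you use.

```python
import hashlib

seen_hashes = {}  # doc_id -> content hash of the last version we embedded

def sync_documents(documents, embed, upsert):
    """Re-embed and upsert only documents that are new or changed."""
    updated = []
    for doc_id, text in documents.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if seen_hashes.get(doc_id) != digest:   # new or changed document
            upsert(doc_id, embed(text), text)   # write a fresh embedding
            seen_hashes[doc_id] = digest
            updated.append(doc_id)
    return updated

# Demo with dummy embed/upsert functions and a dict as the "database":
store = {}
updated = sync_documents(
    {"faq-1": "Refunds take 5 days.", "faq-2": "Support is 9-5."},
    embed=lambda text: [len(text)],  # dummy embedding for the demo
    upsert=lambda doc_id, vec, text: store.__setitem__(doc_id, (vec, text)),
)
print(updated)  # → ['faq-1', 'faq-2']
```

Run on a schedule (or triggered by change events), a job like this keeps the vector database close to real time without re-embedding your entire corpus on every pass.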
These same architectures can also help work around the context-window limits of LLMs by representing and storing information in a more compact and efficient form.
With vector databases, you can go beyond API calls to GPT and add advanced features to your AI applications.
Innovations like semantic information retrieval and retrieval augmented generation simultaneously increase the accuracy and reliability of responses to queries, whilst circumventing the context constraints of LLMs and keeping your company data private.
Whether you or your company plan to build or buy, generative AI can help make your technology more intuitive and open up new use cases for your products and services.
If you’re interested in exploring how AI could unlock new ideas on your roadmap then get in touch. We’re running free half-day workshops to answer all your questions and collaborate on what opportunities exist. We’re experienced with building, buying and everything in between.