What makes vector similarity special?
Short answer: embeddings. The technique of translating a piece of content into a vector representation is called embedding. Embeddings let you analyze semantic content mathematically: models turn our content into vectors and place them in a vector space, where similarity is judged by the distance between two embeddings. These embeddings are stored in vector databases. In this article, we will use Supabase with the pgvector extension enabled to store them.

Overview of our app
Our app will use the Supabase vector database to store articles as embeddings. When a new query arrives, the database will recommend the most relevant article. This is how the process will work:
- The application will read a text file containing a list of article titles.
- It will then use OpenAI models through Portkey to convert the content into embeddings.
- These embeddings will be stored in pgvector, along with a function that enables similarity matching.
- When a user enters a new query, the application will return the most relevant article, using the database's similarity-match function.
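To make the "distance between embeddings" idea concrete, here is a minimal sketch of cosine similarity in plain JavaScript. The vectors are toy 3-dimensional examples; real embeddings have hundreds of dimensions, and in our app pgvector will compute this on the database side.

```javascript
// Toy illustration of embedding similarity: the closer two vectors
// point in the same direction, the higher their cosine similarity.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const query = [0.1, 0.9, 0.2];
const articleA = [0.12, 0.88, 0.25]; // close to the query in vector space
const articleB = [0.9, 0.05, 0.1];   // far from the query

console.log(cosineSimilarity(query, articleA) > cosineSimilarity(query, articleB)); // true
```

A "most relevant article" query is simply the row whose embedding maximizes this score against the query embedding.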
Setup
Get going by setting up three things for this tutorial: a NodeJS project, Portkey, and Supabase.

Portkey
- Sign up and log in to the Portkey dashboard.
- Grab your OpenAI API key and add it to the Portkey Vault.
For the NodeJS project, initialize one and set up its package.json. Since we want to store the list of articles in the database, we have to read them from a file. Create articles.txt and add your list of article titles, one per line. Then create index.js and you are ready. Let’s start writing code.
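The original article list isn't reproduced here; any plain list of titles, one per line, works. A hypothetical articles.txt for a support site might look like:

```text
How to reset your password
Understanding your monthly invoice
Setting up two-factor authentication
Exporting your account data
```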
Step 1: Importing and authenticating Portkey and Supabase
Since our app is set to interact with OpenAI (via Portkey) and the Supabase pgvector database, let’s import the necessary SDK clients to run operations on them. We also import fs to help us read the list of articles from the articles.txt file, and define USER_QUERY, the query we will use for the similarity search.
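A sketch of this setup, assuming the portkey-ai and @supabase/supabase-js packages; the environment-variable names and the example query are placeholders you should replace with your own:

```javascript
import fs from "fs";
import Portkey from "portkey-ai";
import { createClient } from "@supabase/supabase-js";

// Placeholder credentials — set these in your environment.
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.OPENAI_VIRTUAL_KEY, // the OpenAI key saved in Portkey Vault
});

const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_KEY
);

const USER_QUERY = "How do I reset my password?"; // example user query
```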
Step 2: Create a Table
We can use the SQL Editor to execute SQL queries. We will have one table for this project; let’s call it support_articles. It will store the title of each article along with its embedding. Feel free to add more fields of your choice, such as a description or tags.
For simplicity, create a table with columns for id, content, and embedding.
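For example, in the SQL Editor (a sketch; the 1536 dimension assumes OpenAI’s text-embedding-ada-002 model):

```sql
-- Enable the pgvector extension, then create the table.
create extension if not exists vector;

create table support_articles (
  id bigserial primary key,
  content text not null,   -- the article title
  embedding vector(1536)   -- text-embedding-ada-002 vectors are 1536-dimensional
);
```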
Step 3: Read, Generate and Store embeddings
We will use the fs library to read articles.txt and convert every title on the list into an embedding. With Portkey, generating embeddings is straightforward and the same as working with the OpenAI SDK, with no additional code changes required.
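A sketch of how this step might look, assuming the client setup from Step 1; portkey.embeddings.create mirrors the OpenAI SDK, and the helper name storeSupportArticles is illustrative:

```javascript
// Read the titles, embed each one via Portkey, and insert the rows.
async function storeSupportArticles() {
  const titles = fs
    .readFileSync("articles.txt", "utf8")
    .split("\n")
    .filter((line) => line.trim().length > 0);

  for (const content of titles) {
    const response = await portkey.embeddings.create({
      model: "text-embedding-ada-002",
      input: content,
    });
    const embedding = response.data[0].embedding;

    const { error } = await supabase
      .from("support_articles")
      .insert({ content, embedding });
    if (error) throw error;
  }
}
```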
Finally, call await storeSupportArticles(); to populate the table.
You should now see the rows created from the Table Editor.
Step 4: Create a database function to query similar match
Next, let’s set up a database function to do the vector similarity search in Supabase. This function will take the user-query vector as an argument and return the id, content, and similarity score of the best-matching row in the database.
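Following the common pgvector pattern for similarity search, a sketch of such a function (the name match_support_articles and the threshold/count parameters are illustrative):

```sql
-- Return the rows most similar to the query embedding.
-- The <=> operator is pgvector's cosine distance; 1 - distance = similarity.
create or replace function match_support_articles(
  query_embedding vector(1536),
  match_threshold float,
  match_count int
)
returns table (id bigint, content text, similarity float)
language sql stable
as $$
  select
    support_articles.id,
    support_articles.content,
    1 - (support_articles.embedding <=> query_embedding) as similarity
  from support_articles
  where 1 - (support_articles.embedding <=> query_embedding) > match_threshold
  order by support_articles.embedding <=> query_embedding
  limit match_count;
$$;
```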
The support_articles table is now equipped to serve vector similarity search queries.
No more waiting! Let’s run a search query.
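Putting it together, a sketch that embeds USER_QUERY and calls the database function via Supabase RPC; the helper name and the threshold value are illustrative:

```javascript
// Embed the user's query, then ask the database for the closest article.
async function findSimilarArticle(query) {
  const response = await portkey.embeddings.create({
    model: "text-embedding-ada-002",
    input: query,
  });

  const { data, error } = await supabase.rpc("match_support_articles", {
    query_embedding: response.data[0].embedding,
    match_threshold: 0.75, // tune this for your data
    match_count: 1,
  });
  if (error) throw error;
  return data[0]; // { id, content, similarity } of the best match
}

const best = await findSimilarArticle(USER_QUERY);
console.log(best.content, best.similarity);
```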