⚡ ZeusDB Vector Store
ZeusDB is a high-performance, Rust-powered vector database with enterprise features such as quantization, persistence, and logging.
This notebook shows how to get started with the ZeusDB vector store and use ZeusDB efficiently with LangChain.
Setup
Install the ZeusDB LangChain integration package from PyPI:
pip install -qU langchain-zeusdb
Setup in Jupyter Notebooks
💡 Tip: If you’re working inside Jupyter or Google Colab, use the %pip magic command so the package is installed into the active kernel:
%pip install -qU langchain-zeusdb
Getting Started
This example uses OpenAIEmbeddings, which requires an OpenAI API key (get your OpenAI API key here).
If you prefer, you can also use this package with any other embedding provider (Hugging Face, Cohere, custom functions, etc.).
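As a minimal illustration of what "any other embedding provider" means in practice: LangChain's vector stores only need an object exposing `embed_documents` and `embed_query` (the formal route is subclassing `langchain_core.embeddings.Embeddings`). The `DeterministicEmbeddings` class below is a hypothetical stand-in that hashes text into fixed-size vectors — not semantically meaningful, but handy for offline testing without an API key:

```python
import hashlib


class DeterministicEmbeddings:
    """Toy stand-in for an embedding provider (hypothetical, offline-only).

    Vector stores call the two methods below; for the formal interface,
    subclass langchain_core.embeddings.Embeddings instead.
    """

    def __init__(self, dim: int = 8):
        self.dim = dim

    def _embed(self, text: str) -> list[float]:
        # Hash the text and map the first `dim` bytes to floats in [0, 1)
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [b / 256 for b in digest[: self.dim]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> list[float]:
        return self._embed(text)


emb = DeterministicEmbeddings(dim=8)
print(len(emb.embed_query("hello")))  # 8
```

Swap in a real provider (OpenAI, Hugging Face, Cohere) for production use — this class only guarantees determinism, not semantic similarity.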
Install the LangChain OpenAI integration package from PyPI:
pip install -qU langchain-openai
# Use this command if inside Jupyter Notebooks
#%pip install -qU langchain-openai
Choose one of the options below to provide your OpenAI API key:
Option 1: 🔑 Enter your API key each time
Use getpass in Jupyter to securely input your key for the current session:
import os
import getpass
os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')
OpenAI API Key: ········
Option 2: 🗂️ Use a .env file
Keep your key in a local .env file and load it automatically with python-dotenv:
from dotenv import load_dotenv
load_dotenv() # reads .env and sets OPENAI_API_KEY
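For reference, a minimal .env file looks like the sketch below (the key value is a placeholder — substitute your own, and keep the file out of version control):

```shell
# .env -- do not commit this file
OPENAI_API_KEY=sk-your-key-here
```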
🎉🎉 That's it! You are good to go.
1. Initialization
# Import required Packages and Classes
from langchain_zeusdb import ZeusDBVectorStore
from langchain_openai import OpenAIEmbeddings
from zeusdb import VectorDatabase
# Initialize embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Create ZeusDB index
vdb = VectorDatabase()
index = vdb.create(
    index_type="hnsw",
    dim=1536,  # must match the embedding model's output dimension (1536 for text-embedding-3-small)
    space="cosine"
)
# Create vector store
vector_store = ZeusDBVectorStore(
    zeusdb_index=index,
    embedding=embeddings
)
2. Manage Vector Store
2.1 Add items to vector store
from langchain_core.documents import Document
document_1 = Document(
    page_content="ZeusDB is a high-performance vector database",
    metadata={"source": "https://docs.zeusdb.com"}
)
document_2 = Document(
    page_content="Product Quantization reduces memory usage significantly",
    metadata={"source": "https://docs.zeusdb.com"}
)
document_3 = Document(
    page_content="ZeusDB integrates seamlessly with LangChain",
    metadata={"source": "https://docs.zeusdb.com"}
)
documents = [document_1, document_2, document_3]
vector_store.add_documents(documents=documents, ids=["1", "2", "3"])
['1', '2', '3']
2.2 Update items in vector store
updated_document = Document(
    page_content="ZeusDB now supports advanced Product Quantization with 4x-256x compression",
    metadata={"source": "https://docs.zeusdb.com", "updated": True}
)
vector_store.add_documents([updated_document], ids=["1"])
['1']
2.3 Delete items from vector store
vector_store.delete(ids=["3"])
True
3. Query Vector Store
3.1 Query directly
Performing a simple similarity search:
results = vector_store.similarity_search(
    query="high performance database",
    k=2
)

for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")
* ZeusDB now supports advanced Product Quantization with 4x-256x compression [{'source': 'https://docs.zeusdb.com', 'updated': True}]
* Product Quantization reduces memory usage significantly [{'source': 'https://docs.zeusdb.com'}]
If you want to execute a similarity search and receive the corresponding scores (in cosine space these are distance-style scores, so lower values indicate closer matches):
results = vector_store.similarity_search_with_score(
    query="memory optimization",
    k=2
)

for doc, score in results:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")
* [SIM=0.501] Product Quantization reduces memory usage significantly [{'source': 'https://docs.zeusdb.com'}]
* [SIM=0.763] ZeusDB now supports advanced Product Quantization with 4x-256x compression [{'source': 'https://docs.zeusdb.com', 'updated': True}]
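In the output above the closer match carries the lower score, i.e. the scores behave as distances. A common post-processing step is to discard weak matches beyond a cutoff — a minimal sketch over plain (text, score) tuples shaped like the output above, with a hypothetical threshold value:

```python
# Hypothetical (text, score) pairs, shaped like similarity_search_with_score
# output; scores behave as distances (lower = closer)
results = [
    ("Product Quantization reduces memory usage significantly", 0.501),
    ("ZeusDB now supports advanced Product Quantization with 4x-256x compression", 0.763),
]

MAX_DISTANCE = 0.6  # tunable cutoff; matches beyond this are discarded
kept = [(text, score) for text, score in results if score <= MAX_DISTANCE]
print(kept)  # only the 0.501 match survives
```

The right threshold depends on your embedding model and data, so treat 0.6 as an illustration rather than a recommendation.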
3.2 Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains:
retriever = vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 2}
)
retriever.invoke("vector database features")
[Document(id='1', metadata={'source': 'https://docs.zeusdb.com', 'updated': True}, page_content='ZeusDB now supports advanced Product Quantization with 4x-256x compression'),
Document(id='2', metadata={'source': 'https://docs.zeusdb.com'}, page_content='Product Quantization reduces memory usage significantly')]
4. ZeusDB-Specific Features
4.1 Memory-Efficient Setup with Product Quantization
For large datasets, use Product Quantization to reduce memory usage:
# Create memory-optimized vector store
quantization_config = {
    'type': 'pq',
    'subvectors': 8,
    'bits': 8,
    'training_size': 10000
}
vdb_quantized = VectorDatabase()
quantized_index = vdb_quantized.create(
    index_type="hnsw",
    dim=1536,
    quantization_config=quantization_config
)
quantized_vector_store = ZeusDBVectorStore(
    zeusdb_index=quantized_index,
    embedding=embeddings
)
print(f"Created quantized store: {quantized_index.info()}")
Created quantized store: HNSWIndex(dim=1536, space=cosine, m=16, ef_construction=200, expected_size=10000, vectors=0, quantization=pq(subvectors=8, bits=8, untrained, inactive, compression=768.0x))
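The compression=768.0x figure in the output above follows directly from the configuration: each 1536-dimensional float32 vector occupies 1536 × 4 bytes, while its PQ code occupies one entry per subvector (8 subvectors × 8 bits = 8 bytes). A quick sanity check of that arithmetic:

```python
dim = 1536        # vector dimensionality
subvectors = 8    # from quantization_config
bits = 8          # bits per subvector code

raw_bytes = dim * 4                 # float32 storage per vector
pq_bytes = subvectors * bits // 8   # quantized storage per vector
compression = raw_bytes / pq_bytes
print(compression)  # 768.0
```

Note the index reports the quantizer as untrained/inactive until `training_size` vectors have been added, so the memory savings apply only once training has occurred.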
4.2 Persistence
Save and load your vector store to disk:
How to Save your vector store
# Save the vector store
vector_store.save_index("my_zeusdb_index.zdb")
True
How to Load your vector store
# Load the vector store
loaded_store = ZeusDBVectorStore.load_index(
    path="my_zeusdb_index.zdb",
    embedding=embeddings
)
print(f"Loaded store with {loaded_store.get_vector_count()} vectors")
Loaded store with 2 vectors
Usage for Retrieval-Augmented Generation (RAG)
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the LangChain RAG tutorials and how-to guides.
API reference
For detailed documentation of all ZeusDBVectorStore features and configurations, head to the API reference: https://docs.zeusdb.com/en/latest/vector_database/integrations/langchain.html
Related
- Vector store conceptual guide
- Vector store how-to guides