JaguarDB

The Most Scalable Vector Database

Example: RAG and Chat

Retrieval-Augmented Generation (RAG) addresses a key limitation of Large Language Models (LLMs): their tendency to produce hallucinated or factually inaccurate content. RAG combines a retrieval component, which sources accurate, context-specific information from a large dataset, with a generative model, which produces fluent and coherent responses. The retrieval component first fetches the information most relevant to a given query or context. That information is then passed to the generative model, guiding it toward responses that are contextually coherent and anchored in the retrieved facts. As a result, RAG improves the reliability and quality of LLM output for tasks that require high factual correctness and detailed context understanding.

The following Python example breaks documents into chunks and stores them in JaguarDB, which complements the LLM when answering questions from users. The texts retrieved from JaguarDB are fed to the LLM, which analyzes them and provides a coherent and logical final answer.

The following Python code splits a document into chunks:

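from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the document, then split it into overlapping chunks.
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=300)
docs = text_splitter.split_documents(documents)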
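
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a simplified fixed-width sliding window, not LangChain's actual algorithm (CharacterTextSplitter splits on a separator first and then merges pieces), and the function name split_with_overlap is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 300) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.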

Next, we initialize a Jaguar client object:

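from langchain_community.vectorstores.jaguar import Jaguar
from langchain_openai import OpenAIEmbeddings

# Address of the Jaguar HTTP gateway.
url = 'http://192.168.3.88:8080/fwww/'
embeddings = OpenAIEmbeddings()
pod = 'vdb'
store = 'langchain_rag_store'
vector_index = 'v'
vector_type = 'cosine_fraction_float'
# Must match the dimension of the vectors produced by the embedding model.
vector_dimension = 1536
vectorstore = Jaguar(
    pod, store, vector_index,
    vector_type, vector_dimension, url, embeddings
)
vectorstore.login()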
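
The vector_type name above suggests a cosine-based measure. As an illustration only, the sketch below computes cosine similarity between plain Python lists and ranks stored vectors against a query. It is not JaguarDB's implementation, and the function names cosine_similarity and top_k are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: list[list[float]], k: int = 2) -> list[int]:
    """Indexes of the k stored vectors most similar to the query, best first."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    stored = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    print(top_k([1.0, 0.0], stored, k=2))
```

A real vector database avoids this brute-force scan by using an index, but the ranking criterion is the same idea.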

Now we can create the vector store in the database. This call should be made only once. Metadata consists of the extra data fields stored in addition to the vector data; we can think of them as columns in a relational database system. Since we are capturing the text from the documents, we also need to give the size of the text field in the store. The size only needs to be large enough to hold one chunk of text.

metadata = 'category char(16)'
text_size = 1024
vectorstore.create(metadata, text_size)
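
Because each chunk has to fit in the text field, it can help to check the chunks against text_size before loading them. The sketch below assumes the limit is measured in UTF-8 bytes, which the text above does not confirm; the helper name oversized_chunks is hypothetical.

```python
def oversized_chunks(chunks: list[str], text_size: int, encoding: str = "utf-8") -> list[int]:
    """Return the indexes of chunks whose encoded length exceeds text_size."""
    return [
        i for i, chunk in enumerate(chunks)
        if len(chunk.encode(encoding)) > text_size
    ]


if __name__ == "__main__":
    # 1000 ASCII characters fit in 1024 bytes; 600 two-byte characters do not.
    sample = ["a" * 1000, "é" * 600]
    print(oversized_chunks(sample, text_size=1024))
```

With chunk_size=1000 and text_size=1024, plain ASCII chunks leave a small margin, while text with many multi-byte characters may need a larger field under this assumption.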

With the documents and the vector store ready, we can add the documents to the Jaguar store and get a retriever object. To search texts with filters on metadata, pass a where condition in search_kwargs when getting the retriever.

vectorstore.add_documents(docs)
retriever = vectorstore.as_retriever()
#retriever = vectorstore.as_retriever(search_kwargs={"where": "a='123' and b='xyz'"})
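
The where condition in the commented line is an ordinary string, so it can be assembled from a dictionary of field values. This is a minimal sketch: build_where is a hypothetical helper, not part of the Jaguar store API, and rejecting values that contain a single quote is a conservative assumption made here rather than documented behavior.

```python
def build_where(filters: dict[str, str]) -> str:
    """Join field/value pairs into a condition such as "a='123' and b='xyz'"."""
    parts = []
    for field, value in filters.items():
        if not field.isidentifier():
            raise ValueError(f"invalid field name: {field!r}")
        if "'" in value:
            # Assumption: refuse quotes rather than guess an escaping rule.
            raise ValueError(f"value for {field!r} must not contain a single quote")
        parts.append(f"{field}='{value}'")
    return " and ".join(parts)


if __name__ == "__main__":
    where = build_where({"a": "123", "b": "xyz"})
    print(where)
    # The resulting string would be passed as search_kwargs={"where": where}.
```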

Then we can create a chain and ask the LLM some questions:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

template = '''You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
'''
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(model_name='gpt-4', temperature=0)
# The retriever fills {context}; the query passes through unchanged as {question}.
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
query = 'What did the president say about Justice Breyer?'
print(f"Question: {query}")
r = rag_chain.invoke(query)
print("Answer:")
print(r)
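
To show what reaches the LLM, the following sketch fills the same template by hand from a list of retrieved chunks. It uses no LangChain code; format_context and build_prompt are hypothetical helpers, and the way the chunks are joined is an assumption for illustration rather than what the chain above does internally.

```python
TEMPLATE = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
"""


def format_context(chunks: list[str]) -> str:
    """Join retrieved chunks into one context string, separated by blank lines."""
    return "\n\n".join(chunk.strip() for chunk in chunks)


def build_prompt(question: str, chunks: list[str]) -> str:
    """Fill the template with the user question and the retrieved context."""
    return TEMPLATE.format(question=question, context=format_context(chunks))


if __name__ == "__main__":
    retrieved = ["First retrieved chunk. ", "Second retrieved chunk."]
    print(build_prompt("What did the president say about Justice Breyer?", retrieved))
```

Printing the filled prompt this way is a simple check that the retrieved text, and not only the question, is being sent to the model.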

The example above depends on the LangChain stack and the jaguar.py store file. See the RAG directory of github.com/fserv/jaguar-sdk for a complete example.

JaguarDB

JaguarDB offers comprehensive vector database support for artificial intelligence, along with instantly scalable datalake storage for raw media files and robust similarity search capabilities. This facilitates efficient handling of large datasets and enhances AI applications that require rapid data retrieval and similarity comparisons. With these features integrated, JaguarDB provides a seamless solution for managing and analyzing complex data in AI-driven environments.
