How to Choose the Right Vector Store for Your GenAI Offering


Pretty much everyone is catching the wave of Generative AI. Startups and legacy enterprises alike are incorporating this powerful technology into existing business plans or building entirely new plans around it. As you consider incorporating GenAI into your own offering, you’ll be confronted with a crucial decision: Which vector store is right for you?

Before diving into the most popular options, let’s be clear about what vector stores are and why they’re important. These specialized databases store high-dimensional vector data and index it for fast similarity search. GenAI systems seldom consist of the GenAI model alone. For example, many applications benefit from a retrieval step that finds the most relevant data to send as input to a large language model. This retrieval can be done via embeddings, which are semantically meaningful vector representations of text or images. Vector databases become indispensable here, as they’re built to index and retrieve this data efficiently.
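To make the idea of similarity search concrete, here is a minimal sketch of exact top-k retrieval over embeddings using cosine similarity. The three-dimensional vectors, the document IDs and the `top_k` helper are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and a vector store would replace the plain dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Score every stored vector against the query and return the k best IDs."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy "embeddings" keyed by document ID.
store = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "return-window": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedding of a user question

results = top_k(query, store, k=2)
print(results)
```

The retrieved documents would then be passed to the language model as context. Scanning every vector like this only works for small collections, which is where the indexing done by a vector store comes in.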

At the time of this post’s publication, Loka has engaged in conversations with more than 200 startups, founders and leaders at major companies about their GenAI adoption plans. From these conversations we’ve learned which use cases customers are focusing on, the types of data they’re working with and the requirements for their planned GenAI-powered applications. Critical to many use cases is selecting the right vector database.

Based on the insights we’ve gathered, we’ve defined four key considerations that will help you select the appropriate vector database for your offering.

1. What are your existing stack and training needs?

Integration Ease: Your current technology stack will dictate how smoothly a vector database can be integrated. If your company’s stack is built on Postgres and your team already has the skills to run it, then you have good reason to leverage the PGVector extension. If your team runs their stack in Kubernetes, then deploying a Weaviate server might be the most efficient solution.
Training Needs: For organizations with limited resources, choosing a database that requires extensive training may not be optimal. Evaluate the learning curve associated with each database solution: some might have comprehensive documentation, active communities or even official training courses.

2. Do you have a managed service requirement?

Scalability and Maintenance: We work with many startups that do not have dedicated DevOps teams. In these cases, a service that handles all the scaling and maintenance is a key requirement. This is where managed services like RDS with PGVector, or serverless options like OpenSearch or Pinecone, come in. On the other hand, if you have a dedicated DevOps team, then the most cost-effective option may be to run your own vector DB server such as Weaviate or Qdrant.
Cost Implications: Managed services might come at a higher price but can still be beneficial because they often include updates, maintenance and support. Weigh the long-term costs of having an in-house team manage the database against the subscription fees of a managed service.

3. How complex is your retrieval?

Performance: One of the key pieces of a vector DB is the approximate nearest neighbors (ANN) method that powers the similarity search. There are multiple types of ANN algorithms, such as HNSW and IVF. Most vector DBs offer their own customized implementations of these algorithms, each of which trades off speed against quality of retrieval. If your application doesn’t have a lot of data, you could get away with a slower but higher-recall option. With a large dataset, you may need to give up some recall to keep query latency acceptable.
Features: Depending on the complexity of your data and the quality of retrieval you need, you may want to leverage the advanced features that some vector DBs offer beyond basic top-k retrieval. For example, if you need all the top features available, Weaviate may be a good solution given that its excellent integration with LangChain and LlamaIndex makes it easy to step up your retrieval by leveraging metadata filtering, hybrid search and reranking. But if all you need is top-k retrieval, then a simpler option like ChromaDB may be sufficient.
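The speed versus recall tradeoff can be illustrated with a toy IVF-style index. This is a simplified sketch, not the implementation used by any particular vector DB: the centroids are sampled at random rather than learned with k-means, and the data, the helper names and the `nprobe` parameter name are assumptions for illustration. The index groups vectors into buckets by nearest centroid and searches only the `nprobe` buckets closest to the query, so a lower `nprobe` scans fewer vectors but may miss true neighbors.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Brute-force search: scans every vector, so recall is perfect."""
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:k]

def build_buckets(vectors, centroids):
    """Assign each vector index to the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def ivf_top_k(query, vectors, centroids, buckets, k, nprobe):
    """Search only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    best = sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]
    return best, len(candidates)

def recall(approx, exact):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx) & set(exact)) / len(exact)

rng = random.Random(0)  # fixed seed so the run is repeatable
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids = rng.sample(vectors, 10)  # assumption: real IVF indexes learn these
buckets = build_buckets(vectors, centroids)
query = [rng.random() for _ in range(8)]
exact = exact_top_k(query, vectors, k=10)

report = []
for nprobe in (1, 3, 10):
    approx, scanned = ivf_top_k(query, vectors, centroids, buckets, k=10, nprobe=nprobe)
    report.append((nprobe, scanned, recall(approx, exact)))
    print(f"nprobe={nprobe:2d} scanned={scanned:3d} recall={report[-1][2]:.2f}")
```

Probing all ten buckets scans the full dataset and reproduces the exact result, while probing fewer buckets does less work at the possible cost of recall. Production indexes expose similar tuning knobs, though their names and defaults differ by product.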

4. What’s your usage pattern?

Real Time: This is the typical scenario for vector DBs in GenAI applications. A common example is an LLM-powered chatbot that needs to retrieve up-to-date information from a knowledge base. These cases require a vector DB available 24/7, so a vector DB server such as Weaviate or RDS with PGVector can be a good solution.
Batch: Another common use case is LLM-powered batch workflows that benefit from having RAG-like capabilities and therefore require a vector DB. The key difference here is that these do not run 24/7; they may run only a couple of times a day, making a 24/7 server a waste of resources. In these cases, vector DBs that run in memory such as ChromaDB can be a cost-effective solution. Another viable option is a serverless vector DB such as OpenSearch by AWS.

*The graph above shows how Loka chose different vector DB solutions based on specific customer requirements.*
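As a rough summary, the rules of thumb from the four considerations can be written down as a small helper. This is only a sketch of the guidance in this post, not Loka's actual selection process: the `Requirements` fields and the `suggest_vector_stores` function are hypothetical names, and a real decision would also weigh cost, data volume and team preferences.

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    """Hypothetical inputs, one or two per consideration discussed above."""
    uses_postgres: bool = False
    uses_kubernetes: bool = False
    has_devops_team: bool = False
    needs_advanced_retrieval: bool = False  # metadata filtering, hybrid search, reranking
    batch_only: bool = False                # runs a few times a day rather than 24/7

def suggest_vector_stores(req):
    """Collect (option, reason) pairs; overlapping suggestions are left to the reader."""
    suggestions = []
    # 1. Existing stack
    if req.uses_postgres:
        suggestions.append(("PGVector", "stack is already built on Postgres"))
    if req.uses_kubernetes:
        suggestions.append(("Weaviate", "stack already runs in Kubernetes"))
    # 2. Managed service requirement
    if req.has_devops_team:
        suggestions.append(("Weaviate or Qdrant, self-hosted", "DevOps team can run the server"))
    else:
        suggestions.append(("RDS with PGVector, OpenSearch or Pinecone", "scaling and maintenance are handled for you"))
    # 3. Retrieval complexity
    if req.needs_advanced_retrieval:
        suggestions.append(("Weaviate", "metadata filtering, hybrid search and reranking"))
    else:
        suggestions.append(("ChromaDB", "basic top-k retrieval is enough"))
    # 4. Usage pattern
    if req.batch_only:
        suggestions.append(("ChromaDB or serverless OpenSearch", "no always-on server needed"))
    else:
        suggestions.append(("Weaviate or RDS with PGVector", "must be available 24/7"))
    return suggestions

startup = Requirements(uses_postgres=True, has_devops_team=False)
suggestions = suggest_vector_stores(startup)
for option, reason in suggestions:
    print(f"{option}: {reason}")
```

When several suggestions point to the same product, as PGVector does for this hypothetical Postgres-based startup without a DevOps team, that is a reasonable place to start evaluating.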

In conclusion, choosing a vector store is not about picking the most popular option but rather finding the one that aligns with your organization's specific needs. Addressing these four considerations will position you to make an informed choice and propel your AI and machine learning projects to success.

Check out Loka’s GenAI Workshop to start putting your GenAI plan into action.

Artificial Intelligence

December 4, 2023

Loka Staff

Telmo Felgueira

Senior ML Engineer

AND

Thomas Cummins

Principal Solutions Architect at Loka
