
Vector Database

A vector database stores embeddings — mathematical representations of meaning — and retrieves results by similarity rather than exact match. It's the infrastructure that makes RAG work, and understanding it explains why AI search feels so different from keyword search.

What is a Vector Database?

A vector database is a specialized data store designed to index and query high-dimensional vectors: arrays of floating-point numbers that represent the semantic content of text, images, audio, or other data. These vectors are called embeddings. When you run a piece of text through an embedding model, you get back a vector positioned so that texts with similar meanings map to nearby points in the high-dimensional space. A vector database makes it fast to find the stored vectors nearest to a given query vector, an operation called similarity search.
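To make "nearby points" concrete, here is a minimal sketch of cosine similarity, one common way to score how close two embeddings are. The three-dimensional vectors are hand-written, hypothetical values chosen for readability; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; the values are illustrative, not model output.
login_broken = [0.9, 0.1, 0.0]
cannot_authenticate = [0.8, 0.2, 0.1]
refund_request = [0.0, 0.2, 0.9]

sim_close = cosine_similarity(login_broken, cannot_authenticate)
sim_far = cosine_similarity(login_broken, refund_request)
print(f"related texts:   {sim_close:.3f}")
print(f"unrelated texts: {sim_far:.3f}")
```

With these toy values, the two login-related vectors score much higher than the login and refund pair, which is the property similarity search relies on.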

Popular vector databases include Pinecone, Weaviate, Qdrant, Chroma, and pgvector (a PostgreSQL extension). They differ in deployment model, performance characteristics, and filtering capabilities, but they all solve the same core problem: making approximate nearest neighbor search fast enough to be useful at scale. A traditional SQL database with JSONB columns can store vectors, but without a purpose-built index every query has to be compared against every stored vector, which stops being practical at any meaningful volume.

How Vector Search Works

The workflow for vector search has three steps. First, you embed your corpus: every document, product description, support ticket, or knowledge base article gets run through an embedding model (one of OpenAI’s text-embedding-3 models, Cohere’s Embed, or an open-source model like BGE) and stored as a vector. Second, when a user submits a query, you embed the query using the same model. Third, you find the vectors in your corpus that are most similar to the query vector, typically scored by cosine similarity or dot product, and return the corresponding documents.
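The three steps can be sketched as a brute-force search in plain Python. This is an illustration under stated assumptions, not how a vector database is implemented: `embed` is a hypothetical stand-in that looks up hand-assigned vectors instead of calling a real model, and the search scans every vector rather than using an index.

```python
import math

# Hypothetical stand-in for an embedding model: hand-assigned 3-d vectors.
TOY_VECTORS = {
    "Users are unable to authenticate after the update": [0.8, 0.2, 0.1],
    "How do I request a refund?": [0.0, 0.2, 0.9],
    "Export invoices as CSV": [0.1, 0.9, 0.2],
    "broken login": [0.9, 0.1, 0.0],
}

def embed(text):
    """Hypothetical embedding call; a real system would call a model here."""
    return TOY_VECTORS[text]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query, index, k=2):
    """Steps 2 and 3: embed the query, then rank stored vectors by similarity."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

# Step 1: embed the corpus once and keep (document, vector) pairs.
corpus = [
    "Users are unable to authenticate after the update",
    "How do I request a refund?",
    "Export invoices as CSV",
]
index = [(doc, embed(doc)) for doc in corpus]

results = search("broken login", index, k=1)
print(results)
```

Note that the query and the top result share no words. The match comes entirely from the vectors being close, which here is an artifact of the hand-assigned values and in a real system would come from the embedding model.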

The key insight is that “similar” here means semantically similar, not lexically similar. A keyword search for “broken login” would miss a ticket that describes “unable to authenticate” without those exact words. A vector search would find it, because the embedding model places authentication failure and broken login close together in vector space. This is why vector search tends to improve recall over keyword search in knowledge base and retrieval applications, particularly when users and documents describe the same thing in different words.

Vector Databases vs Traditional Databases

Traditional relational databases are built for structured queries: find all rows where column A equals X and column B is greater than Y. They’re optimized for exact matches, range queries, and joins across normalized tables. They’re the right tool for transactional workloads — orders, users, inventory — where the structure is known and the queries are precise.

Vector databases are built for a different query type: “find the things most similar to this.” That requires different indexing algorithms, such as the graph-based HNSW and the cluster-based IVFFlat, two widely used approaches that give up a small amount of recall in exchange for dramatically faster approximate search. The tradeoff is acceptable for most retrieval applications: finding the top 10 most relevant documents doesn’t require perfect recall, just good enough recall at low latency.
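The recall-for-speed tradeoff can be seen in a toy inverted-file (IVF-style) index. This is a simplified sketch, not the implementation used by any particular database: real IVF indexes learn their centroids with k-means, while here the centroids are fixed by hand, and the class name `TinyIVFIndex` and the parameter name `nprobe` are chosen for illustration.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TinyIVFIndex:
    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = [[] for _ in centroids]  # one bucket of vectors per centroid

    def _nearest_cells(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: l2(vec, self.centroids[i]))
        return order[:n]

    def add(self, item_id, vec):
        # Each vector is stored only in the bucket of its nearest centroid.
        cell = self._nearest_cells(vec, 1)[0]
        self.lists[cell].append((item_id, vec))

    def search(self, query, k=1, nprobe=1):
        # Only the nprobe closest buckets are scanned; the rest are skipped.
        candidates = []
        for cell in self._nearest_cells(query, nprobe):
            candidates.extend(self.lists[cell])
        candidates.sort(key=lambda item: l2(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]

index = TinyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
for item_id, vec in [("a", [1.0, 1.0]), ("b", [4.9, 4.9]),
                     ("c", [5.2, 5.2]), ("d", [9.0, 9.0])]:
    index.add(item_id, vec)

query = [5.1, 5.1]
fast = index.search(query, k=2, nprobe=1)   # scans one bucket
exact = index.search(query, k=2, nprobe=2)  # scans every bucket
print(fast, exact)
```

Probing one bucket misses "b", a true near neighbor that sits just across the bucket boundary, and returns the more distant "d" instead. Probing more buckets recovers it at the cost of more distance computations.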

In practice, most production RAG systems combine both: a relational database or document store holds the actual content and metadata, while the vector database holds the embeddings and handles similarity search. You retrieve candidate document IDs from the vector database, then fetch the full content from the relational store. Keeping the two concerns separate makes each easier to maintain and scale independently.
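A minimal sketch of this two-store pattern, assuming an in-memory SQLite table as the relational store and a plain dictionary standing in for the vector database. The table name, document contents, and vectors are all hypothetical.

```python
import math
import sqlite3

# Relational store: full content and metadata, keyed by document ID.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.executemany(
    "INSERT INTO documents (id, title, body) VALUES (?, ?, ?)",
    [
        (1, "Login troubleshooting", "Steps to resolve authentication failures."),
        (2, "Refund policy", "How refunds are requested and processed."),
        (3, "CSV export", "How to export invoices as CSV."),
    ],
)

# Stand-in for the vector database: document ID -> embedding (toy values).
vector_store = {1: [0.8, 0.2, 0.1], 2: [0.0, 0.2, 0.9], 3: [0.1, 0.9, 0.2]}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_ids(query_vec, k):
    """Similarity search returns only document IDs, best match first."""
    ranked = sorted(
        vector_store,
        key=lambda doc_id: cosine_similarity(query_vec, vector_store[doc_id]),
        reverse=True,
    )
    return ranked[:k]

def retrieve(query_vec, k=2):
    """Fetch full rows for the candidate IDs, preserving similarity order."""
    ids = top_ids(query_vec, k)
    placeholders = ",".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT id, title, body FROM documents WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[doc_id] for doc_id in ids]

titles = [row[1] for row in retrieve([0.9, 0.1, 0.0], k=2)]
print(titles)
```

The SQL `IN` clause does not guarantee result order, so the sketch reorders rows by the ranking the similarity search produced.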

Use Cases for Startups

The highest-value use cases for vector databases in product contexts:

  • Knowledge base search: Internal documentation, support articles, or product manuals that users can query in natural language rather than navigating category trees.
  • RAG for AI assistants: Grounding an LLM’s answers in your specific content — contracts, policies, product specs — rather than its training data alone.
  • Semantic deduplication: Finding near-duplicate records in a database where exact-match deduplication misses paraphrases and abbreviations.
  • Recommendation systems: Recommending similar products, articles, or content based on semantic similarity to what a user has engaged with.
  • Classification at scale: Routing incoming items — emails, tickets, documents — to the right category by finding the closest labeled examples.
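The classification use case can be sketched as a nearest-neighbor vote over labeled examples. This is a minimal illustration with hand-written toy vectors and made-up labels. In practice the vectors would come from an embedding model and the neighbor lookup would be handled by the vector database.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical labeled examples: (embedding, category).
labeled_examples = [
    ([0.9, 0.1, 0.0], "auth"),
    ([0.8, 0.2, 0.1], "auth"),
    ([0.0, 0.2, 0.9], "billing"),
    ([0.1, 0.1, 0.8], "billing"),
]

def classify(vec, examples, k=3):
    """Assign the majority label among the k most similar labeled examples."""
    nearest = sorted(
        examples,
        key=lambda example: cosine_similarity(vec, example[0]),
        reverse=True,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

ticket_a = classify([0.7, 0.3, 0.1], labeled_examples)
ticket_b = classify([0.1, 0.3, 0.9], labeled_examples)
print(ticket_a, ticket_b)
```

Adding a new category in this scheme means adding labeled examples rather than retraining a classifier, which is part of the appeal for routing tasks.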

For startups at early scale, a managed vector database service (Pinecone, Weaviate Cloud) removes the operational burden of running the infrastructure yourself. pgvector is a reasonable starting point if you’re already running Postgres and your vector volumes are modest — it defers the decision to migrate to a dedicated service until you have the data to justify it.

Related Terms and Concepts

Retrieval-Augmented Generation, LLM, Context Window, Agentic AI, SaaS