Technology
Vector database
A vector database stores text as embeddings (high-dimensional numeric arrays that represent meaning) and supports similarity search: 'find the documents most similar to this query.' It is the backbone of RAG (retrieval-augmented generation) and AI search. Top picks: pgvector, Pinecone, Weaviate.
More detail
Embeddings are generated by an embedding model (for example OpenAI text-embedding-3, Voyage, or Cohere). Each chunk of text becomes a vector of roughly 1,000 to 3,000 dimensions, depending on the model (OpenAI's text-embedding-3-small produces 1,536). The database indexes these vectors for nearest-neighbour search. At SMB scale (under 100k documents), pgvector, a Postgres extension, is plenty. Pinecone is overkill for most businesses and adds a separate vendor relationship.
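To make nearest-neighbour search concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the measure and uses made-up 3-dimensional vectors in place of real embeddings. The document names, the vectors, and the helper names (cosine_similarity, top_k) are all hypothetical. A real vector database would typically use an approximate index rather than scanning every vector as this does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Return the k (doc_id, score) pairs most similar to query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones come from an embedding model
# and have on the order of a thousand dimensions or more.
DOCS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "returns-form": [0.8, 0.2, 0.1],
}

# Stand-in for the embedding of a query such as "how do I get my money back?"
QUERY = [1.0, 0.0, 0.0]

results = top_k(QUERY, DOCS, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The two documents whose vectors point in nearly the same direction as the query rank first. This brute-force scan is fine for a few thousand vectors; an index exists to avoid comparing the query against every stored vector as the collection grows.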
