codingstairs
NotesEDULifeContact
⌕Search⌘K
koen

Navigation

  • Intro
  • Blog
  • Life

Get in touch

Send without signing in. Add your email if you'd like a reply.

  • Leave a message anonymously →
  • ✉ warragon112@gmail.com
  • KakaoTalk Open Chat ↗

© 2026 codingstairs

  • Notes
  • EDU
  • Search
  • Life
  • Contact
  • Legal
  • RSS
  • GitHub
EDU›Local LLM · pgvector · building a RAG chatbot›Step 3

Step 3

pgvector + HNSW setup

0 views

pgvector + HNSW setup

You need a store for embeddings. Dedicated vector DBs (Pinecone, Qdrant, Weaviate) are great — but if you already run PostgreSQL, pgvector is the simplest.

1. Install

docker run -d --name pgvector \
  -e POSTGRES_PASSWORD=postgres -p 5432:5432 \
  pgvector/pgvector:pg16

Or on an existing PG:

CREATE EXTENSION IF NOT EXISTS vector;

2. Schema

CREATE TABLE IF NOT EXISTS document_chunks (
  id          BIGSERIAL PRIMARY KEY,
  document_id BIGINT NOT NULL,
  chunk_index INT NOT NULL,
  content     TEXT NOT NULL,
  embedding   vector(768),
  created_at  TIMESTAMPTZ DEFAULT now()
);

Change the dim and you must rebuild the index.

3. HNSW vs IVFFlat

Index Build Query Accuracy Good for
HNSW slow fast very high read-heavy RAG · search
IVFFlat fast ok needs tuning bulk load + periodic rebuild

HNSW by default for RAG.

4. Create index

CREATE INDEX ON document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Build after the full load is faster.

5. Query

SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM document_chunks
ORDER BY embedding <=> $1::vector
LIMIT 10;
  • <=> cosine distance, <-> L2, <#> negative dot

Must sort ASC by distance for HNSW to engage.

6. Runtime tuning

SET hnsw.ef_search = 100;  -- default 40

Higher = more accurate and slower. Adjust per query requirements.

7. Gotchas

  • Missing CREATE EXTENSION vector;
  • Sorting by similarity DESC — HNSW idle; use distance ASC
  • Dimension mismatch after model change
  • No index — EXPLAIN should show Index Scan

8. Backup notes

pg_dump stores vectors as text. On restore, rebuild HNSW indexes (parallel with pg_restore --jobs).

Closing

pgvector delivers 80% of a vector DB with 20% of the effort. Up to ~100M vectors in a single PostgreSQL is usually fine.

Next

  • 04-rag-pipeline

← Step 2

Embeddings — text to vectors

Step 4 →

RAG pipeline