// ls categories/ai-engineering/
AI Engineering
Chunking Strategies for Retrieval
Why chunk size is the most undertuned variable in RAG, how recursive, semantic, and structural chunking differ, and when parent-document retrieval beats them all.
Vector Databases & ANN Indexes
How HNSW, IVF, and ScaNN trade recall for speed, why exact KNN doesn't scale, and how to choose between pgvector, Qdrant, and Pinecone for production use.
Text Embeddings: Turning Meaning into Geometry
How embedding models encode text as dense vectors, why cosine similarity works as a measure of semantic closeness, and how to build semantic search in Python and TypeScript.
LLM Inference: Tokens, Context, and Sampling
How LLMs actually process text: tokenization with BPE, the context window as working memory, KV caching, and sampling parameters that control output variance.