
RAG explained: Making enterprise knowledge usable with AI

2026-05-27 · by Bits and Friends GmbH · 5 min

[rag] [enterprise-knowledge] [knowledge-management] [embeddings] [vector-search] [confluence] [sharepoint]

“Where is that test report from 2018?” Every company knows this question. Hardly anyone knows the answer, except the same three senior colleagues whose vacation is dangerously close. This is exactly where RAG (Retrieval Augmented Generation) steps in. This article explains what RAG is, what it isn’t, and when it pays off for you.

What RAG means

RAG stands for Retrieval Augmented Generation. In three steps:

Retrieval: The user asks a question. The system searches your documents for matching passages by meaning, not just by exact keywords.
Augmented: The passages found are inserted into the prompt of a language model.
Generation: The language model formulates an answer grounded in these passages.

Result: an answer that comes not from the model’s memory (“what it probably sounds like”) but from your actual documents (“what is actually documented at your company”).
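The three steps can be sketched in a few lines. This is a minimal illustration, not a production setup: a simple word-count cosine similarity stands in for a real embedding model, the function names (`retrieve`, `augment`, `generate`) are hypothetical, and `llm` is any callable you supply that takes a prompt and returns text.

```python
import re
from collections import Counter
from math import sqrt


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Retrieval: return the k passages most similar to the question."""
    return sorted(passages, key=lambda p: similarity(question, p), reverse=True)[:k]


def augment(question: str, found: list[str]) -> str:
    """Augmented: put the retrieved passages into the prompt."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(found, start=1))
    return (
        "Answer only from the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def generate(prompt: str, llm) -> str:
    """Generation: hand the augmented prompt to a language model callable."""
    return llm(prompt)
```

In a real system, `similarity` would compare embedding vectors and `llm` would call a local or hosted model; the control flow stays the same.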

What RAG is NOT

To prevent misunderstandings:

RAG is not full-text search. A full-text search finds the word “discount” in document 3. RAG also finds “early-payer rebate” in document 17 because it is semantically similar.
RAG is not model training. Your documents do not flow into the model weights. They stay with you; the relevant passages are retrieved fresh on every request.
RAG is not magic knowledge. What isn’t in any document, RAG won’t find. RAG makes gaps in your documentation visible; it does not close them.
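The difference between full-text and semantic search can be shown with a toy example. The vectors below are hand-assigned for illustration only (they are not the output of any embedding model), and the document texts, the dimension meanings, and the 0.7 threshold are assumptions made up for this sketch.

```python
from math import sqrt

# Hand-assigned toy vectors, illustrative only.
# Dimensions loosely mean: [price reduction, payment timing, HR topics].
DOCS = {
    "doc3: We grant a discount of 5 percent on bulk orders.": [0.9, 0.1, 0.0],
    "doc17: Early-payer rebate of 2 percent within 10 days.": [0.8, 0.6, 0.0],
    "doc21: Vacation requests go through the HR portal.": [0.0, 0.1, 0.9],
}
QUERY_VECTOR = [1.0, 0.2, 0.0]  # toy vector for the query "discount"


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def full_text_hits(term: str) -> list[str]:
    """Full-text search: only documents containing the literal term."""
    return [doc for doc in DOCS if term.lower() in doc.lower()]


def semantic_hits(query_vector: list[float], threshold: float = 0.7) -> list[str]:
    """Semantic search: documents whose vector points in a similar direction."""
    return [doc for doc, vec in DOCS.items() if cosine(query_vector, vec) >= threshold]
```

Here `full_text_hits("discount")` matches only document 3, while `semantic_hits(QUERY_VECTOR)` also returns document 17, because its toy vector lies close to the query vector.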

When RAG pays off

Concrete scenarios where our customers use RAG productively:

Service desk assistant. “How do I fix issue X for customer Y?” The agent searches tickets, runbooks, and knowledge articles and suggests the solution that worked in similar past cases. Service desk resolutions become more consistent.

Onboarding companion. New employees ask “How does our vacation request process work?” or “Which tools do I need for my role?”, and the agent pulls the right answer from onboarding docs, the HR wiki, and internal guides.

Sales preparation. “What did we last deliver to customer Z?” The agent searches CRM notes, offers, and old delivery notes and generates a summary for the next conversation.

Compliance research. “Which NIS-2 requirements apply to our category?” The agent pulls relevant passages from internal compliance docs and public sources, with references to the originals.

What you need technically

Sources with readable APIs: Confluence, SharePoint, Nextcloud, filesystem, mailboxes, ticket histories. All of them have interfaces.
An indexing pipeline: Documents are split into chunks (roughly 500 tokens each), converted into vectors by an embedding model, and stored in a vector database (pgvector, Qdrant, Weaviate, Chroma).
Hybrid search: The best results come from combining BM25 (full-text) with vector search and re-ranking. We set this up by default because vector search alone often misses too much, typically exact terms such as product codes or error numbers.
A language model for answer synthesis: local (Gemma, Llama, Mistral via vLLM or Ollama) or via API (OpenAI, Anthropic), depending on the data class.
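Two of these building blocks are easy to sketch. The first function splits text into chunks of about 500 units; it counts whitespace-separated words as a rough stand-in for tokens, and the 50-word overlap is an assumption, not something prescribed above. The second merges a BM25 ranking and a vector ranking using reciprocal rank fusion, one common fusion method chosen here for illustration; the article does not state which method is used in practice, and a re-ranking step would follow afterwards.

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into pieces of about `size` words, overlapping by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each list contributes 1 / (k + rank) per document; k = 60 is a commonly
    used default for this method.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

A document that appears near the top of both lists ends up ahead of one that ranks first in only one of them, which is the behavior you want from a hybrid setup.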

The three hard points: access, versions, hallucination

Access model. The agent must respect what the person asking is allowed to see. Anyone without access to the HR wiki gets no RAG answer from the HR wiki either. We enforce ACLs directly in the retrieval phase.
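A minimal sketch of what filtering in the retrieval phase can look like. It assumes each indexed chunk carries the set of groups allowed to read it; the `Chunk` type, its field names, and the group names are hypothetical. The point is that chunks are dropped before ranking, so restricted content never reaches the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str
    allowed_groups: frozenset[str]  # groups that may read the source document
    score: float                    # relevance score from the search step


def visible_chunks(chunks: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only chunks that share at least one group with the user."""
    return [c for c in chunks if c.allowed_groups & user_groups]


def retrieve_for_user(chunks: list[Chunk], user_groups: set[str], k: int = 3) -> list[Chunk]:
    """Filter by permission first, then rank what is left by relevance."""
    permitted = visible_chunks(chunks, user_groups)
    return sorted(permitted, key=lambda c: c.score, reverse=True)[:k]
```

Filtering before ranking, rather than after generation, means a highly relevant but restricted passage cannot leak into an answer.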

Version awareness. If document X from 2018 says “We use tool A” and document Y from 2024 says “We use tool B”, the agent must prefer the newer one or flag both. We handle this with metadata and recency scoring in re-ranking.
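One way to express recency scoring is an exponential decay on document age that is blended with the relevance score. This is a sketch under assumptions: the one-year half-life and the 30 percent recency share are illustrative values, and the function names are hypothetical.

```python
from datetime import date


def recency_weight(last_modified: date, today: date, half_life_days: float = 365.0) -> float:
    """Return 1.0 for a document changed today, halving every `half_life_days`."""
    age_days = max((today - last_modified).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank_score(relevance: float, last_modified: date, today: date,
                 recency_share: float = 0.3) -> float:
    """Blend relevance (0..1) with recency; `recency_share` sets the recency weight."""
    recency = recency_weight(last_modified, today)
    return (1 - recency_share) * relevance + recency_share * recency
```

With these values, a 2024 document with slightly lower relevance outranks a 2018 document on the same topic. Flagging both versions instead of silently preferring one would be a separate check on top of this score.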

Hallucination guards. The agent must back every statement with a source. If no source is found, the agent says so instead of inventing an answer. Confidence thresholds with answer refusal are mandatory.
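A confidence threshold with refusal can be sketched as follows. The `Passage` type, the refusal wording, and the 0.5 threshold are assumptions for illustration; `llm` is any callable that takes a prompt and returns text. The model is only called when at least one passage clears the threshold, and the sources are appended to the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    text: str
    source: str
    score: float  # retrieval confidence between 0 and 1


REFUSAL = "I could not find this in the indexed documents."


def guarded_answer(question: str, passages: list[Passage], llm, min_score: float = 0.5) -> str:
    """Answer only when retrieval found sufficiently confident passages."""
    supported = [p for p in passages if p.score >= min_score]
    if not supported:
        return REFUSAL  # refuse instead of letting the model invent an answer
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(supported, start=1)
    )
    prompt = (
        "Answer only from the numbered sources and cite them.\n"
        f"{context}\n\nQuestion: {question}"
    )
    answer = llm(prompt)
    cited = ", ".join(f"[{i}] {p.source}" for i, p in enumerate(supported, start=1))
    return f"{answer}\nSources: {cited}"
```

This only guarantees that sources are present and listed. Checking that each sentence of the answer is actually supported by a cited passage is a further step this sketch does not cover.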

Where to start pragmatically

Start with the use case where data is plentiful and the data class is low. Service desk is usually the best entry point: many tickets, many solutions in the history, clearly documented escalation paths. Onboarding is the second choice: less volume but high value per query.

Our platform recommendation: pgvector if you already run PostgreSQL (one less component), Qdrant if you need to scale.

Next steps

If you want to know whether RAG fits your company, let’s talk. Tell us which sources you have and which questions your team asks several times a day.

→ Start AI Readiness Check

→ Detail page: RAG & Enterprise Knowledge